TL;DR - yes, you are correct that it sounds that way in the video, but it's not that way in person, for several reasons. Onwards....
This is actually an issue with how the audio in the video is captured and compressed. The supersonic crack from the suppressed rifle sounds more pronounced because the recording of the braked rifle is already maxed out across more frequency ranges, so its crack has no headroom left to stand out. The mic can only pick up a maximum amount of sound pressure, and the codecs applied to the raw file will cap amplitudes very aggressively to compress the data.
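Here's a rough toy model of that headroom problem, if it helps. It is only a sketch: the amplitudes are made-up linear units (not real SPL numbers), the `CEILING` value and the `record` and `crack_margin` helpers are hypothetical, and a real phone uses limiters and gain control that behave more gently than this hard clip.

```python
# Toy model only: arbitrary linear amplitudes, not measured gunshot data.
CEILING = 1.0  # hypothetical maximum level the mic/codec chain can represent


def record(samples, ceiling=CEILING):
    """Hard-clip every sample to the ceiling, like an overloaded recording chain."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]


def crack_margin(blast_level, crack_level, ceiling=CEILING):
    """How far the crack rises above the muzzle blast in the recorded signal."""
    blast_only = record([blast_level], ceiling)[0]
    blast_plus_crack = record([blast_level + crack_level], ceiling)[0]
    return blast_plus_crack - blast_only


# Made-up levels: the braked blast is far over the ceiling, the suppressed one is under it.
braked = crack_margin(blast_level=5.0, crack_level=0.6)
suppressed = crack_margin(blast_level=0.3, crack_level=0.6)

print(f"crack margin, braked:     {braked:.2f}")
print(f"crack margin, suppressed: {suppressed:.2f}")
```

Same crack in both cases, but in the braked recording it adds nothing on top of an already pegged signal, while in the suppressed recording it sticks out clearly.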
The supersonic crack of the 300 Norma is at least as loud as, if not louder than, that of the 6mm Creedmoor, but it's buried in the overall noise signature in a way a video camera can't record. You are 100% correct that there is an apparently louder sound in the suppressed video, but that's not what happens if you're physically present.
View attachment 318726
The video does not let you hear all the sound that happens above and below the threshold. In person, sound pressure in that range is still something you could tell apart from anything lower. No sound can actually exceed roughly 194 dB SPL at sea level, for a couple of reasons. One limit is that the vacuum (low-pressure) side of the wave can't go any lower than a certain point. The other is that past a certain level sound waves exert real pressure on objects, and you won't perceive more loudness but will actually absorb the energy. That's the point where sound pressure will physically compress or potentially tear the tympanic membranes in your ears. That's why ears and noses bleed when bombs go off - the pressure wave is absorbed by fluids in the skin/membranes and capillaries rupture. Sound and shock waves are two different magnitudes of the same concept.
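The ceiling in the low 190s of dB SPL can be sanity-checked with the usual SPL formula. This is a back-of-the-envelope sketch, assuming standard sea-level pressure (101,325 Pa) and the standard 20 µPa reference; the function name `spl_db` is just something I made up for the example.

```python
import math

P_REF = 20e-6      # Pa, standard reference pressure for dB SPL in air
P_ATM = 101_325.0  # Pa, assumed standard sea-level atmospheric pressure


def spl_db(pressure_pa):
    """Convert a sound pressure in pascals to dB SPL."""
    return 20.0 * math.log10(pressure_pa / P_REF)


# The low-pressure half of the wave bottoms out at a vacuum, so the largest
# symmetric swing is one atmosphere either side of ambient.
peak_limit = spl_db(P_ATM)                # quoting the peak pressure
rms_limit = spl_db(P_ATM / math.sqrt(2))  # same sine wave quoted as RMS

print(f"peak: {peak_limit:.1f} dB SPL, RMS: {rms_limit:.1f} dB SPL")
```

Depending on whether you quote peak or RMS you land at about 194 or 191 dB, which is why slightly different numbers get thrown around. Anything stronger than that stops being an ordinary sound wave and is a shock wave.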
If you watched the Rittenhouse trial, a large part of the videography expert testimony was about how the visual frames from the camera and the audio don't sync when you break the file down because of the codec used during recording. Reduced down, it's because sound has to be perceived over time while an image can be frozen. You can't stretch sound out as easily as pulling up a single frame of video because the frequencies will be distorted as you stretch the timeframe of the audio. Software would have to interpolate sound data that isn't there to make it line up, and anything that could be described as "making something up" is NOT something you want used in a courtroom (not that that prosecutor would have failed any harder even if he had said that). Long story short, there's no way to exactly place gunshots in a slowed-down video, so it's not clear and incontrovertible evidence against self defense. My point - if the encoding software compressed the audio, there's no way to retrieve an uncompressed sample of the data, meaning you can't identify the exact moment certain audible events happen if they happen above or below the compression threshold. In the unsuppressed video, there's no way to pull out the uncompressed data and do an accurate comparison of the data.
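To show what I mean about stretching audio, here's a minimal sketch using a plain test tone instead of real footage. Everything in it is an assumption for illustration: the 8 kHz sample rate, the 1 kHz tone, and the helper names `tone`, `stretch_2x`, and `estimate_freq`. It slows the audio to half speed the naive way, by filling in an interpolated sample between every recorded pair.

```python
import math

SAMPLE_RATE = 8000  # Hz, arbitrary choice for this sketch


def tone(freq_hz, seconds):
    """Generate a pure sine tone as a list of samples."""
    n = int(SAMPLE_RATE * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def stretch_2x(samples):
    """Naive half-speed stretch: insert an interpolated sample between each pair."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)
        out.append((a + b) / 2)  # invented data: this value was never recorded
    out.append(samples[-1])
    return out


def estimate_freq(samples):
    """Estimate frequency by counting upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * SAMPLE_RATE / len(samples)


original = tone(1000, 1.0)
stretched = stretch_2x(original)

original_hz = estimate_freq(original)
stretched_hz = estimate_freq(stretched)
print(f"original: ~{original_hz:.0f} Hz, stretched: ~{stretched_hz:.0f} Hz")
```

The stretched version plays back at roughly half the pitch, and about half of its samples are values the software made up. Fancier time-stretching can hold the pitch, but it still has to synthesize data that was never captured.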
A gunshot video is very similar to recording a drummer in a studio - it's difficult to measure the amplitude of the input frequencies with a small microphone relatively far away from the noise source, so you have to use as big a mic as close as practical to the source. You don't record a drum set with a single mic overhead; you have to mic each drum individually to pick up the lower frequencies. Cymbals you can mic overhead because the higher-range frequencies propagate better. If you want to capture high-resolution audio data of a gunshot, a cell phone 15 feet away isn't going to get you enough information to make a meaningful comparison.
Isn't it cool how our senses straight up lie to us?