Format wars. DSD against everyone, or everyone against it?

Here's my experience. Out of curiosity, I set the player to upsample by a whole-number multiple of the source rate. I also tried having the player convert on the fly to a DSD128 and a DSD256 stream. In every case the sound lost some of its edginess and harshness and became somewhat softer and more pleasant, yet microdetail came through more easily and distinctly. Overall the sound becomes more pleasant, more comfortable, more microdetailed. The only downside: my laptop is old and can't keep up with all this, so playback periodically starts to stutter. I had to turn it off.
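For a rough sense of why an old laptop might struggle, here is a minimal sketch of the sample rates involved. It assumes a 44.1 kHz source (the post doesn't say what the source rate was) and uses the standard definition of DSD64/128/256 as multiples of 44.1 kHz. The function names are made up for illustration. The raw bit rate is not the CPU load itself, only a hint at how much more output the filtering and modulation have to produce per second.

```python
# Base rate of CD audio; DSD rates are defined as multiples of it.
CD_RATE_HZ = 44_100


def integer_upsample_rate(base_hz: int, factor: int) -> int:
    """Sample rate after upsampling by a whole-number factor."""
    return base_hz * factor


def dsd_rate_hz(multiple: int) -> int:
    """1-bit sample rate of DSD64/128/256 (multiple = 64, 128 or 256)."""
    return CD_RATE_HZ * multiple


def bitrate_mbit(rate_hz: int, bits_per_sample: int, channels: int = 2) -> float:
    """Raw (uncompressed) bit rate in Mbit/s."""
    return rate_hz * bits_per_sample * channels / 1e6


cd_mbit = bitrate_mbit(CD_RATE_HZ, 16)
print(f"CD 16/44.1: {cd_mbit:.4f} Mbit/s stereo")
for multiple in (64, 128, 256):
    rate = dsd_rate_hz(multiple)
    mbit = bitrate_mbit(rate, 1)
    print(f"DSD{multiple}: {rate / 1e6:.4f} MHz, {mbit:.4f} Mbit/s stereo, "
          f"{mbit / cd_mbit:.0f}x the CD bit rate")
```
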


You've correctly described the gist of the benefit you get from upscaling and from converting PCM to a DSD stream. But it would be more correct to do this ahead of playback rather than on the fly. On-the-fly operations like these put extra load on the CPU, which leads to additional dirt that the CPU injects into the signal chain in the form of added jitter and interference.

Folks, a question came up for me recently. There's this SACD: Janine Jansen, Vivaldi – The Four Seasons (2004, SACD) - Discogs. And look at what it says:

Hybrid Super Audio CD containing SACD Surround (Multi-channel 96 kHz/24 bit PCM recording), SACD Stereo (Stereo 96 kHz/24 bit PCM recording) and CD Audio.

So does this mean they took hi-res PCM and converted it to DSD? But if so, why bother listening to DSD at all, if conversion inevitably involves losses? And how do you avoid getting caught out like this with other SACDs where the source isn't stated? How do you check?


Only by ear. Once SACD players with good DSD DACs on board appeared, many people liked listening to PCM conversions, plus it meant the disc could be sold one more time.
These days you have to go by the files you have on hand plus the implementation of the DAC itself. There are DACs that play DSD better than PCM, but they are the minority.

It just gets ridiculous. My DAC does its own internal DSD → PCM conversion. So when I listen to SACDs like that, I'm listening to a double conversion.


Chord doesn't do that; it has its own internal format.
It traditionally handles DSD better than PCM, since it is itself a kind of delta-sigma implementation.
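For readers who haven't seen delta-sigma spelled out, here is a minimal sketch of the idea: a 1-bit stream whose local average tracks the input. It is a first-order modulator followed by a crude moving-average low-pass filter. Both are simplifications chosen for brevity and are assumptions here, not how any particular DAC or DSD encoder works (real DSD encoders use higher-order noise shapers). The names and the oversampling ratio are illustrative.

```python
import math


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: returns a list of +1.0/-1.0 values."""
    integrator = 0.0
    feedback = 0.0  # previous 1-bit output, fed back to the input
    bits = []
    for x in samples:
        integrator += x - feedback  # accumulate the input/output error
        feedback = 1.0 if integrator >= 0.0 else -1.0  # 1-bit quantizer
        bits.append(feedback)
    return bits


def moving_average(values, window):
    """Crude low-pass filter: mean over a sliding window of `window` values."""
    out = []
    acc = 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= window:
            acc -= values[i - window]
        if i >= window - 1:
            out.append(acc / window)
    return out


OSR = 64       # oversampling ratio (illustrative)
PERIOD = 3200  # tone period in 1-bit samples (illustrative)
n = OSR * 200

signal = [0.5 * math.sin(2 * math.pi * i / PERIOD) for i in range(n)]
bits = delta_sigma_1bit(signal)        # the "DSD-like" 1-bit stream
recovered = moving_average(bits, OSR)  # averaging the bits recovers the tone

worst = max(abs(r - signal[j + OSR // 2]) for j, r in enumerate(recovered))
print(f"worst-case deviation from the input tone: {worst:.4f}")
```

Every output sample is either +1 or -1, yet averaging over a short window gets back close to the original waveform. The residual error is the quantization noise that a higher-order shaper and a better filter would push down further.
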

Where did you read that? As I recall, even Rob Watts said so himself, though I can't provide the source right now. And I don't think the M-Scaler can upsample DSD, if that's even possible for DSD.

And here's Rob's own opinion on DSD:


There are actually two independent issues going on with DSD that limits the musicality - and they are interlinked problems.

The first issue is down to the resolving power of DSD. Now a DSD works by using a noise shaper, and a noise shaper is a feedback system. Indeed, you can think of an analogue amplifier as a first order noise shaper - so you have a subtraction input stage that compares the input to the output, followed by a gain stage that integrates the error. With a delta sigma noise shaper its exactly the same, but where the output stage is truncated to reduce the noise shaper output resolution so it can drive the OP - in the case of DSD its one bit, +1 or -1 op stage. But you use multiple gain stages connected together so you have n integrators - typically 5 for DSD. Now the number of integrators, together with the time constants will determine how much error correction you have within the system - and the time constants are primarily set by the over-sample rate of the noise shaper. Double the oversampling frequency and with a 5th order ideal system (i.e. one that does not employ resonators or other tricks to improve HF noise) it converges on a 30 dB improvement in distortion and noise.

So where does lack of resolution leave us? Well any signal that is below the noise floor of the noise shaper is completely lost - this is completely unlike PCM where an infinitely small signal is still encoded within the noise when using correct dithering. With DSD any signal below the noise shaper noise floor is lost for good. Now these small signals are essential for the cues that the brain uses to get the perception of sound stage depth - and depth perception is a major problem with audio - conventional high end audio is incapable of reproducing a sense of space in the same way one can perceive natural sounds. Now whilst optimising Hugo’s noise shaper I noticed two things - once the noise shaper performance hit 200 dB performance (that is THD and noise being -200 dB in the audio bandwidth as measured using digital domain simulation) then it no longer got smoother. So in terms of warmth and smoothness, 200 dB is good enough. But this categorically did not apply to the perception of depth, where making further improvements improved the perception of how deep instruments were (assuming they are actually recorded with depth like an organ in a cathedral or off stage effects in Mahler 2, for example). Given the size of the FPGA and the 4e pulse array 2048FS DAC, I got the best depth I could obtain.

But with Dave, no such restriction on FPGA size applied, and I had a 20e pulse array DAC which innately has more resolution and allows smaller time constants for the integrator (so better performance). So I optimised it again, and kept on increasing the performance of the noise shaper - and the perception of depth kept on improving. After 3 months of optimising and redesigning the noise shaper I got to 360 dB performance - an extraordinary level, completely way beyond the performance of ordinary noise shapers. But what was curious was how easy it was to hear a 330 dB noise shaper against a 360 dB one - but only in terms of depth perception. My intellectual puzzle is whether this level of small signal accuracy is really needed, or whether these numbers are acting as a proxy for something else going on, perhaps within the analogue parts of the DAC - I am not sure on this point, something I will be researching. But for sure I have got the optimal performance from the noise shaper employed in Dave, and every DAC I have ever listened to shows similar behaviour.

The point I am making over this is that DSD noise shapers for DSD 64 is only capable of 120 dB performance - and that is some 10 thousand times worse than Hugo - and a trillion times worse than Dave. And every time I hear DSD I always get the same problem of perception of depth - it sounds completely flat with no real sense of depth. Now regular 16 bit red book categorically does not suffer from this problem - an infinitely small signal will be perfectly encoded in a properly dithered system - it will just be buried within the noise.

Now the second issue is timing. Now I am not talking about timing in terms of femtosecond clocks and other such nonsense - it always amuses me to see NOS DAC companies talking about femtosecond accuracy clocks when their lack of proper filtering generates hundreds of uS of timing problems on transients due to sampling reconstruction errors. What I am talking about is how accurately transients are timed against the original analogue signal in that the timing of transients is non-linear. Sometimes the transient will be at one point in time, other times delayed or advanced depending upon where the transient occurs against the sample time. In the case of PCM we have the timing errors of transients due to the lack of tap length in the FIR reconstruction filter. The mathematics is very clear cut - we need extremely long tap lengths to almost perfectly reconstruct the original timing of transients - and from listening tests I can hear a correlation between tap length and sound quality. With Dave I can still hear 100,000 taps increasing to 164,000 taps albeit I can now start to hear the law of diminishing returns. But we know for sure that increasing the tap length will mean that it would make absolutely no difference if it was sampled at 22 uS or 22 fS (assuming its a perfectly bandwidth limited signal). So red book is again limited on timing by the DAC not inherently within the format.

Unfortunately, DSD also has its timing non-linearity issues but they are different to PCM. This problem has never been talked about before, but its something I have been aware of for a long time, and its one reason I uniquely run my noise shapers at 2048FS. When a large signal transient occurs - lets say from -1 to +1 then the time delay for the signal is small as the signal gets through the integrators and OP quantizer almost immediately. But for small signals, it can’t get through the quantizer, and so it takes some time for a small negative signal changing to a positive signal to work its way through the integrators. You see these effects on simulation, where the difference of a small transient to a large transient is several uS for DSD64.

Now the timing non linearity of uS is very audible and it affects the ability of the brain to perceive the starting and stopping of instruments. Indeed, the major surprise of Hugo was how well one can perceive that starting and stopping of notes - it was much better than I expected, and at the time I was perplexed where this ability was coming from. With Dave I managed to dig down into the problem, and some of the things I had done (for other reasons) had also improved the timing non-linearity. It turns out that the brain is much more sensitive than the order of 4 uS of timing errors (this number comes from the inter-aural delay resolution, its the accuracy the brain works to in measuring time from sounds hitting one ear against the other), and much smaller levels degrade the ability for the brain to perceive the starting and stopping of notes.

But timing accuracy has another important effect too - not only is it crucial to being able to perceive the starting and stopping of notes, its also used to perceive the timbre of an instrument - that is the initial transient is used by the brain to determine the timbre of an instrument and if timing of transients is non-linear, then we get compression in the perception of timbre. One of the surprising things I heard with Hugo was how easy it was to hear the starting and stopping of instruments, and how easy it was to perceive individual instruments timbre and sensation of power. And this made a profound improvement with musicality - I was enjoying music to a level I had never had before.

But the problem we have with DSD is that the timing of transients is non-linear with respect to signal level - and unlike PCM you are completely stuck as the error is on the recording and its impossible to remove. So when I hear DSD, it sounds flat in depth, and it has relatively poor ability to perceive the starting and stopping of notes (using Hugo/Dave against PCM). Acoustic guitar sounds quite pleasant, but there is a lack of focus when the string is initially struck - it sounds all unnaturally soft with an inability to properly perceive the starting and stopping. Also the timbre of the instrument is compressed, and its down to the substantial timing non-linearity with signal level.

Having emphasised the problems with delta-sigma or noise shaping you may think its better to use R2R DAC’s instead. But they too have considerable timing errors too; making the timing of signals code independent is impossible. Also they have considerable low level non linearity problems too as its impossible to match the resistor values - much worse than DSD even - so again we are stuck with poor depth, perception of timing and timbre. Not only that they suffer from substantial noise floor modulation, giving a forced hard aggressive edge to them. Some listeners prefer that, and I won’t argue with somebody else’s taste - whatever works for you. But its not real and it not the sound I hear with live un-amplified instruments.

So to conclude; yes I agree, DSD is fundamentally flawed, and unlike PCM where the DAC is the fundamental limit, its in the format itself. And it is mostly limited by the format. Additionally, its very easy to underestimate how sensitive the brain is to extremely small errors, and these errors can have a profound effect on musicality.

Rob
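A quick arithmetic check of the ratios in the quote (120 dB for DSD64 against 200 dB for Hugo and 360 dB for Dave). The sketch below assumes the decibel figures are read as amplitude ratios, i.e. 20·log10; that reading is an assumption, but it is the one that reproduces "10 thousand" and "a trillion". Read as power ratios (10·log10), the same gaps would come out as 10^8 and 10^24. The function and constant names are made up for illustration.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in dB to an amplitude (voltage-like) ratio."""
    return 10 ** (db / 20)


def db_to_power_ratio(db: float) -> float:
    """Convert a level difference in dB to a power ratio."""
    return 10 ** (db / 10)


# Noise shaper performance figures as stated in the quoted post.
DSD64_DB = 120
HUGO_DB = 200
DAVE_DB = 360

hugo_vs_dsd = db_to_amplitude_ratio(HUGO_DB - DSD64_DB)  # 80 dB gap
dave_vs_dsd = db_to_amplitude_ratio(DAVE_DB - DSD64_DB)  # 240 dB gap

print(f"Hugo vs DSD64: {hugo_vs_dsd:.0e} (amplitude ratio)")
print(f"Dave vs DSD64: {dave_vs_dsd:.0e} (amplitude ratio)")
```
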


This is exactly the case of "the more you know, the worse you sleep" :slight_smile: Let me throw in one more thing: when a disc in any format is being prepared, the source material gets converted some N number of times, and nobody knows or can say exactly how many times or how. These conversions are both explicit, triggered by pressing a button in software, and implicit, as the data passes through hardware boxes that have their own processors and firmware inside. Then everything gets converted again at the player's input, in the receivers, filters, DSP, and the DAC chip/circuit itself, again in a way that is not visible to the user. So how are we supposed to live with that? :slight_smile:


Well, there's the end of the format war, peace! Now DACs just do whatever they want and play however they want. ))

That's actually how it is. For example, Holo Audio DACs have an R2R matrix for PCM and a separate resistor ladder for DSD, so in effect there are 2 different DACs in one enclosure. On top of that, they need to be "warmed up" separately, I checked :wink:


Well, I have seen very rare SACDs that say direct to DSD. I assume nothing should have been done to those. But they are really very rare.


In the primary sources… I had to answer this just recently.

You'd think, what does the Dave have to do with it, if I have a TT 2?

They're built identically, for reference.
The TT is a simplified version of the DAVE.


Where did you get this information? Can I have a source? There are different ways to simplify. More than once now I've come across the claim on head-fi and on the Roon community that the TT-2 does a DSD → PCM conversion.


Every picture is easily found with Google… this is public information.
So is the information about the DACs themselves.
By the way, that post of mine has a link to the interview attached.

Fine… I'll find it myself…

Let’s talk about the latter first. The TT2, like its cousins in the Chord Electronics lineup, eschews off-the-shelf DAC chips and implements all its digital logic in an FPGA (Field Programmable Gate Array). But there the similarity with other FPGA DACs ends. Watts hews to his own design philosophy, with the prime guiding principle being to recreate the original analog waveform with the least amount of timing transient errors. His proprietary WTA (Watts Transient Aligned) filter algorithm is a method to approach the performance of an ideal, infinite tap length FIR (finite impulse response) sinc filter, using a large but finite number of “taps,” or coefficients, that can be accommodated within the storage and computational capacity of the FPGA. I lack the expertise to go into further detail, but Watts has kindly provided us with his slide decks (linked below) where he delves into his technology. You can also find videos of his presentations by searching online. Finally, the Chord Electronics M Scaler product page links to an informative paper that describes this design approach.

In the context of Chord Electronics DACs, the tap length has become a metric of evolution of the WTA filter. In the latest lineup, the Hugo 2 and Qutest DACs implement a tap length of 49,152, the TT2 implements 98,304 taps, the DAVE 164,000 taps, and the Blu MkII CD Player and Hugo M Scaler implement just over 1 million (1,015,808) taps. The actual D/A conversion in the TT2 is done with a 10-element pulse array. As a point of reference, the flagship DAVE uses a 20-element pulse array.
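To illustrate what "approaching an ideal sinc filter with a finite number of taps" means, here is a minimal sketch that reconstructs a band-limited tone between two samples using a plainly truncated sinc. This is textbook sinc interpolation with a rectangular cutoff, used here as an assumption for illustration. It is not the WTA filter, which is proprietary, and a well-designed windowed filter would converge much faster than plain truncation does. The test tone, its frequency and phase, and all function names are illustrative.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sample(t: float, freq: float = 0.1, phase: float = 0.3) -> float:
    """Band-limited test tone; sample rate is 1, so Nyquist is 0.5."""
    return math.sin(2 * math.pi * freq * t + phase)


def reconstruct(t: float, taps_per_side: int) -> float:
    """Estimate the tone at time t using only the nearest samples.

    Uses 2 * taps_per_side samples centred on t, weighted by a truncated sinc.
    """
    center = math.floor(t)
    total = 0.0
    for k in range(center - taps_per_side + 1, center + taps_per_side + 1):
        total += sample(k) * sinc(t - k)
    return total


def reconstruction_error(taps_per_side: int, t: float = 0.5) -> float:
    """Absolute error of the truncated-sinc estimate at time t."""
    return abs(reconstruct(t, taps_per_side) - sample(t))


for taps_per_side in (8, 64, 512, 4096):
    err = reconstruction_error(taps_per_side)
    print(f"{2 * taps_per_side:5d} taps: error at t=0.5 is {err:.2e}")
```

With an infinite number of taps the reconstruction of a band-limited signal would be exact; with a finite number, some error remains at inter-sample points, and for plain truncation its upper bound shrinks only roughly in proportion to 1/N.
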

Clear as day. I could brush you off like that too, but what's the point? Will there be any comments on the M-Scaler, or did you carefully skip that?

Added technical info with numbers…

Please clarify your question about it.

But it's useful to read the primary source on it as well.

The-theory-behind-M-Scaler-technology.pdf (415.7 KB)

OK, I won't claim that I've studied the technology in detail, but something tells me that you don't need a delta-sigma modulator to decode DSD, since all the work has already been done. My statement that a DSD → PCM conversion takes place probably shouldn't be taken literally. But you also can't claim that Chord does nothing to DSD, judging by the very block diagram you posted, and by Watts's own words about DSD in general.

So my point was more that listening to a PCM → DSD conversion (made somehow, I don't know how) on a DAC that also does something obscure of its own to DSD is most likely not very effective. I have about 20 DSD albums in my collection, and only three of them are reliably known to have been recorded direct to DSD. So what do I do with the rest? Leave them as they are, or get the PCM version in Red Book format instead? Which will end up being better? And yes, almost all of them are DSD64.

The result depends on which format a particular DAC handles better.
That often contributes more to the final sound than the operations that preceded the recording's appearance on physical media.
