andai 2 days ago

Many moons ago I became quite obsessed with analyzing spectrograms on my computer.

I would load up audio files in Audacity and look at them to see how the audio "looked", as a function of how intense each frequency is over time.
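
For the curious, here is a rough sketch of that idea in Python: chop the audio into short overlapping frames and measure how strong each frequency is within each frame. This is not what Audacity does internally, just a minimal illustration. The function name `spectrogram`, the `frame_size` and `hop` parameters, and the 1000 Hz test tone are all made up for the example.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame lists the magnitude of every frequency bin."""
    # A Hann window tapers each frame's edges to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        mags = []
        # Plain (slow) discrete Fourier transform, keeping the non-negative frequencies.
        for k in range(frame_size // 2 + 1):
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(512)]

spec = spectrogram(samples)
# The strongest bin in the first frame, converted back to a frequency in Hz.
loudest_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
loudest_hz = loudest_bin * sample_rate / 64
print(len(spec), "frames,", len(spec[0]), "bins each, loudest at", loudest_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is one frequency, which is the grid a spectrogram view colors in. A real tool would use an FFT rather than this direct sum, which is only practical for tiny inputs.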

You can even set a track to spectrogram view while recording, which lets you see the sound in real time.

Music also tends to be very beautiful in the spectrogram! And birdsong also. Sometimes I would see a bird first, and only afterwards notice it in my field of hearing.

I noticed while analyzing a podcast that I began to recognize common words like "you." I also noticed that I was able to easily distinguish between different people's voices.

I had to wonder whether, if I were deaf or became deaf, I would suddenly have a strong motivation to learn how to read these things, and to develop some kind of device that would show them to me 24 hours a day.

I have not done this, but the project has remained in the back of my mind for over a decade.

Does anyone else know more about this? Does such a device exist?

I think that only some linguists learn how to read spectrograms. But it seems like something that might be extremely useful to any hearing impaired person?

Relating to the article, I think one could quickly learn to read them fluently (e.g. as subtitles, perhaps overlaid on real life), and of course you get the tonal information built in for free—that's what a spectrogram is!

  • AndrewOMartin 2 days ago

    You're on the fringe of an area which in academia is called Sensory Substitution. A simplification of which is experiencing one of the five senses using different sense organs than usual. Classic examples of this are video cameras which represent their image as a matrix of vibrations on the subject's skin or as a sound.

  • kiicia 2 days ago

    There was a guy who was able to recognize music just by looking at grooves of vinyl recording https://en.wikipedia.org/wiki/Arthur_Lintgen

    • m463 a day ago

      I remember being able to recognize one song on vinyl.

      It was a (Telarc I think?) recording of the 1812 Overture.

      The grooves were wide where the cannons went off, so that the needle could deflect enough to capture the dynamic range. You could see the waveform.

      I think of "Surely You're Joking, Mr. Feynman!" where people could sniff like a bloodhound. Feynman would have people handle books, and he could tell which ones had been handled.

      I think there are things where just trying would succeed more often than you'd think.

shomp 2 days ago

The book Understanding Comics by Scott McCloud is a tremendous study in this area. Scott shows how you can add abstract meanings to words and pictures through illustration.

foofoo12 2 days ago

Very interesting idea. I remember reading that in visual spoken communications, only 20% is the actual words. The rest is tone of voice, body language, context, emphasis, expressions, ... all that stuff.

I don't know if 20% is correct, but I feel it's very close to it. I also think a lot of internet arguments happen as a direct result of miscommunication. Emojis are great, but they get abused to the point that HN filters them out. Perhaps allow readers to toggle if they want to see emojis or not?

  • Isognoviastoma 2 days ago

    Easy to check: try to speak with someone speaking a foreign language you don't know and estimate what percentage of what they said you understood from tone of voice etc. I would guess it's less than 80%.

    • foofoo12 2 days ago

      That's very easy and very wrong. Let's say you have a 100 page book. Page 1 contains fundamental knowledge that allows you to understand the rest of it. If you skip page 1 then you won't understand the other 99.

      How much of the book will you understand if you only read page 1?

      • kalavan a day ago

        That then raises the question: what is a unit of communication?

        If communication is 20% verbal and 80% nonverbal, and if communication is very nonlinear in understanding (as with your book example), how do we know what 1% of communication is? What does it mean, and how can we tell that the figure is correct, when our main or only way of detecting whether communication succeeded is through understanding or lack thereof?

        • foofoo12 a day ago

          > when our main or only way of detecting whether communication succeeded is through understanding or lack thereof

          That's not even a good test, due to miscommunication. Both parties might think it succeeded, but then much later on you find out the truth (maybe).

      • ethmarks a day ago

        But tonal information can be parsed without lexical understanding and vice versa.

        Somebody cursing in French can still be interpreted as anger even if you don't understand French, and written profanity can still be interpreted as anger even if you didn't hear it spoken.

        Tone and language do complement each other, but neither is a prerequisite for the other like your book analogy would suggest.

        • foofoo12 a day ago

          > but tonal information can be parsed without lexical understanding

          Parsed perhaps, but it's so context sensitive that it's not useful, save for the extremes. The same tone of voice can have so many meanings based on what's actually being said, and yet another if you add context.

    • cenamus 2 days ago

      Maybe also control for cultural similarity, but I definitely agree

  • eszed 2 days ago

    There's an acting exercise (it's from Joan Littlewood via Clive Barker) where one speaks "gibberish" - making language sounds, but not words - which, almost automatically, once they drop their terror of doing it, opens students up to all of those other avenues of communication. Later, you can switch students back and forth between the script and gibberish, and it becomes plain that if you can't play a scene as clearly (to those in it, not considering the audience) in gibberish as you can with words then you don't fully understand it.

failrate 2 days ago

Comic books already use changes in font, weight, size, of text and the shape of the word balloon to indicate tone and expression.

mati365 2 days ago

Consider learning Polish. Kurwa sounds exactly as it looks.

voxleone 2 days ago

Emojis absolutely have their place here. They can add tone, nuance, and a bit of humanity where plain text can feel flat.

  • embedding-shape 2 days ago

    I feel like emojis are the lazy person's way of adding tone, nuance and humanity, when you don't know how to do so by only writing. Don't want to imply it's wrong, it's valid to be lazy, especially when it comes to improving communication, but I find myself thinking "How can I make sure this comes across as the joke it is?" and after one or two minutes I just end up slapping a wink emoji at the end and don't rewrite the text at all, as the lazy person I am.

    • jonplackett 2 days ago

      When you only want to write back a single word + an emoji though, there’s not a lot of space to add tone!

    • pnut 2 days ago

      An idea compressed down into a single character is elegant and efficient.

realty_geek 2 days ago

I've always wondered about this.

In Akan languages it is not difficult to conceive of how the same word can be written in different ways to convey another dimension.

Anyone who speaks an Akan language will understand that each of the words below means "good" but with a slightly different emphasis.

papa papaaapa papapapapapa

What is the linguistic term for this concept?

egberts1 21 hours ago

Now you are delving into the world of intonation, just as ASL can squeeze nearly 200 meanings out of a single sign, or Navajo can utter a consonant in hundreds of ways that befuddle even the best enemy codebreakers.

Spoken English is the same.

Just watch a typical George Carlin video on how he stretches out a single word.

beepbooptheory 2 days ago

Reminds me of how the captions were done in Tony Scott's Man on Fire (2004). It's a pretty great movie too.