The simplest way to convert analog audio into digital data is PCM (Pulse Code Modulation), which samples the analog signal at a fixed sampling frequency. The Nyquist-Shannon theorem says this sampling rate must be at least twice the highest frequency expected in the signal. The sampled pulses are then converted into digital values by an analog-to-digital converter. Telephony voice was sampled at 8 kHz with 8 bits per sample, resulting in 64 kbit/s.
This approach leaves plenty of room to reduce or compress the data stream.
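The sampling and quantization steps described above can be sketched in Python. This is a minimal illustration, not how a real telephony codec is built: it assumes plain linear quantization, and the names pcm_encode and bit_rate are made up for this example.

```python
import math

def pcm_encode(signal, duration_s, sample_rate_hz=8000, bits=8):
    """Sample a signal (a function of time in seconds returning values in
    [-1.0, 1.0]) and quantize each sample to an unsigned integer code."""
    levels = 2 ** bits
    n_samples = round(duration_s * sample_rate_hz)
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                      # time of the n-th sample
        value = max(-1.0, min(1.0, signal(t)))      # clip to the valid range
        # Map [-1.0, 1.0] onto the integer codes 0 .. levels-1 (linear steps).
        code = min(levels - 1, int((value + 1.0) / 2.0 * levels))
        codes.append(code)
    return codes

def bit_rate(sample_rate_hz, bits):
    """Data rate in bit/s of an uncompressed single-channel PCM stream."""
    return sample_rate_hz * bits

def tone(t):
    """A 1 kHz test tone, well below the 4 kHz limit of 8 kHz sampling."""
    return math.sin(2 * math.pi * 1000 * t)

codes = pcm_encode(tone, duration_s=0.001)   # one millisecond of audio
print(len(codes), "samples")
print(bit_rate(8000, 8), "bit/s")            # 8 kHz * 8 bit, the telephony figure
```

With 8 kHz sampling the highest frequency that can be represented is 4 kHz, which is why the 1 kHz test tone is safe here.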
A first approach was ADPCM (Adaptive Differential Pulse Code Modulation), which makes the height of the AD converter's steps depend on the previous tendency of the signal. This way voice usually sampled with 8 bits could be reduced to 3 bits without noticeable quality loss. The data rate for voice could thus be reduced to 8 kHz * 3 bit = 24 kbit/s.
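The idea of an adaptive step height can be sketched as follows. This is a toy illustration only and does not follow any standardized ADPCM variant: the function names, the starting step of 4, and the rule for growing and shrinking the step are all assumptions chosen to keep the example short.

```python
import math

def _quantize(diff, step, bits):
    """Express a difference as a signed code that fits into `bits` bits."""
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lo, min(hi, round(diff / step)))

def _adapt(step, code, bits):
    """Assumed adaptation rule: a large code means the signal moves fast,
    so the step grows; a small code means it moves slowly, so it shrinks."""
    if abs(code) >= 2 ** (bits - 1) - 1:
        return step * 2
    if abs(code) <= 1:
        return max(1, step // 2)
    return step

def adpcm_encode(samples, bits=3):
    """Encode integer samples as small signed codes relative to a prediction."""
    predicted, step, codes = 0, 4, []
    for sample in samples:
        code = _quantize(sample - predicted, step, bits)
        codes.append(code)
        predicted += code * step         # track what the decoder will rebuild
        step = _adapt(step, code, bits)
    return codes

def adpcm_decode(codes, bits=3):
    """Rebuild the samples by replaying the same prediction and step updates."""
    predicted, step, out = 0, 4, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code, bits)
    return out

# A slow sine wave in a signed 8-bit range, encoded with 3 bits per sample.
samples = [round(100 * math.sin(2 * math.pi * n / 40)) for n in range(80)]
codes = adpcm_encode(samples)
restored = adpcm_decode(codes)
print(codes[:10])
print(restored[:10])
```

The encoder and decoder never exchange the step height itself. Both derive it from the codes already sent, so only the 3-bit codes need to be transmitted.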
The next step in data compression made use of psychoacoustics. The human brain picks out dominant frequencies and ignores adjacent frequencies with a lower amplitude. Using such approaches, data rates for voice in the region of 4.8 kbit/s could be reached. It should be noted that these formats lose data from the original signal; however, the lost data is considered not to be noticed by the human brain.
Most modern audio codecs make use of psychoacoustics. MP3 is one such codec.
MP3 can also carry metadata (the ID3 tag) alongside the sound data. This data is often shown by media players and holds things such as the song title. To view and edit it, programs such as mp3info (or its GUI gmp3info) from http://ibiblio.org/mp3info/, easytag from https://wiki.gnome.org/Apps/EasyTAG, or id3ed can be used.
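To show what such tools read, here is a minimal Python sketch that parses the older ID3v1 form of the tag, which is a fixed 128-byte block at the end of the file starting with the bytes "TAG". It assumes ID3v1 only (newer ID3v2 tags sit at the start of the file and are laid out differently), and the function names and the sample values are made up for this example.

```python
def parse_id3v1(block):
    """Parse a 128-byte ID3v1 block; return a dict, or None if it is no tag."""
    if len(block) != 128 or block[:3] != b"TAG":
        return None

    def text(raw):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(block[3:33]),      # 30 bytes
        "artist": text(block[33:63]),    # 30 bytes
        "album": text(block[63:93]),     # 30 bytes
        "year": text(block[93:97]),      # 4 bytes
        "comment": text(block[97:127]),  # 30 bytes
        "genre": block[127],             # 1 byte, index into a genre list
    }

def read_id3v1(path):
    """Read the last 128 bytes of a file and parse them as an ID3v1 tag."""
    with open(path, "rb") as f:
        f.seek(0, 2)                     # go to the end to learn the file size
        if f.tell() < 128:
            return None
        f.seek(-128, 2)                  # 128 bytes before the end of the file
        return parse_id3v1(f.read(128))

# Build a tag in memory so the example runs without an MP3 file at hand.
demo = (b"TAG"
        + b"Some Title".ljust(30, b"\x00")
        + b"Some Artist".ljust(30, b"\x00")
        + b"Some Album".ljust(30, b"\x00")
        + b"1999"
        + b"".ljust(30, b"\x00")
        + bytes([17]))
tag = parse_id3v1(demo)
print(tag)
```

For a real file, read_id3v1 would be called with the path of an MP3 file and returns None when no ID3v1 tag is present.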