Zachary Berry

Designer & Developer

Automatically Ripping FLAC Files to mp3s Via Shell Script

Disclaimer: These instructions are for Ubuntu (or another Linux distro).

Being somewhat of an audiophile, I’ve been encoding my music in lossless FLAC format for a while now - which is great, except when you want to put it on an iPod (which, of course, won’t play FLAC files). One option would be to convert these files to Apple Lossless; however, I don’t mind sacrificing some quality to be able to put more music on my iPod. Plus, I probably couldn’t tell the difference while I’m listening to the iPod anyway.

Therefore, I wanted a shell script that I could run that would find out which files I had recently added to my collection and convert them to mp3s (preserving metadata). Then, I could grab those mp3s and put them into gtkpod or iTunes.

I created the shell script below borrowing liberally from other similar scripts I found online. The key thing about this script is that it keeps a simple flat text file database so that it can remember which files it has already converted.

The biggest problem I stumbled on was handling genres. For whatever reason, many command line and GUI tools refuse to let you write in your own genres - you pick from a pre-defined list. This boggles my mind - it seems presumptuous to me that some group would be tasked with coming up with all of the possible labels for all music. I listen to a lot of Electronic music, which alone covers ‘Drum and Bass’, ‘House’, ‘Techno’, ‘Electro’, ‘Progressive Trance’, ‘Microhouse’ and a trillion other tiny genres. In addition to sub-genres, you also have the problem of your cousin’s band, which plays an Acoustic Goth-Country “experience” that defies categorization and cannot simply be lumped into ‘Alternative’.

I found two solutions to this problem. For ripping from CDs to FLAC, I use EasyTag, which will let you enter genres by hand. EasyTag’s not my favorite program in the world, but it gets the job done and you can get it from your package manager. For the step where the FLAC files are converted to mp3s, I found id3 mass tagger by squell, which will write whatever genre you give it into the mp3. My script below uses this program (simply called id3), so you’ll need to install that first.

In addition to the id3 mass tagger, you’ll need to have lame (for encoding mp3s), flac (for decoding flacs) and metaflac (for handling the flac metadata). On Linux, all of these should be obtainable through your package manager.
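Before running anything, it can save some head-scratching to confirm that all four tools are actually on your PATH. Here is a minimal sketch of such a check; the check_deps function name is my own invention for illustration and is not part of the script below.

```shell
#!/bin/bash
# Report any required command that is not on the PATH.
# Prints one missing command per line; returns non-zero if any are missing.
# (check_deps is a hypothetical helper name, not part of flac-to-mp3.)
check_deps() {
    local missing=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" > /dev/null 2>&1; then
            echo "$cmd"
            missing=1
        fi
    done
    return "$missing"
}

# The four tools the script relies on:
if ! MISSING=$(check_deps lame flac metaflac id3); then
    echo "Missing tools, install them first:" >&2
    echo "$MISSING" >&2
fi
```

If nothing is printed, all four commands were found. Note that this only checks that a command called id3 exists, not that it is squell's id3 mass tagger rather than another program with the same name.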

Next, grab this script and save it somewhere that is in your path (for example, /usr/bin). Personally, I like making a bin folder in my home directory to keep all of my scripts. You do whatever tickles your fancy. Make sure to give the script executable permissions:

chmod 755 name-of-the-script
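If you go the home-directory route, the setup could look roughly like the sketch below. The install_script and add_to_path helper names and the flac-to-mp3 file name are assumptions made for this example; use whatever name you saved the script under.

```shell
#!/bin/bash
# Copy a script into a bin directory and make it executable.
# Usage: install_script SRC DEST_DIR   (hypothetical helper)
install_script() {
    local src="$1" dest_dir="$2"
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    chmod 755 "$dest_dir/$(basename "$src")"
}

# Add a directory to PATH for this shell session unless it is already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

# Example, assuming the script was saved as ./flac-to-mp3:
# install_script ./flac-to-mp3 "$HOME/bin" && add_to_path "$HOME/bin"
```

The PATH change only lasts for the current shell session; to make it stick you would put the equivalent line in your shell's startup file.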

The script:


#!/bin/bash
#flac-to-mp3 will search FLAC_SOURCE_DIR for all flac files added since the last time you ran flac-to-mp3. These files will be converted to mp3s and placed in MP3_OUTPUT_DIR. This script can be useful when maintaining a flac music collection along with an mp3 player like the iPod.

#Dependencies: lame, flac, metaflac, ID3 mass tagger (id3)

#Directory settings (the two music paths are placeholders; point them at your own folders):
FLAC_SOURCE_DIR="$HOME/music/flac"
MP3_OUTPUT_DIR="$HOME/music/mp3"
DB_DIR="$HOME/.flac-to-mp3"

#You probably don't need to set these. These specify the files that keep track of changes (the diff and tmp file names are assumed):
CURRENT="$DB_DIR/current"
INDEXED="$DB_DIR/indexed"
DIFF="$DB_DIR/diff"
TMP_FILE="$DB_DIR/tmp"

#Remember the default IFS so it can be restored at the end
OLD_IFS="$IFS"

if [ ! -d "$DB_DIR" ]; then
    mkdir -p "$DB_DIR"
fi

if [ ! -f "$CURRENT" ]; then
    touch "$CURRENT"
fi

if [ ! -f "$INDEXED" ]; then
    touch "$INDEXED"
fi

#Update index of flac files
find "$FLAC_SOURCE_DIR" -type f -name "*.flac" > "$CURRENT"

#Get a list of files that have been newly added
if [ -f "$DIFF" ]; then
    rm "$DIFF"
fi

#Sort the database files so the diff won't have problems with files being in a different order:
sort "$INDEXED" > "$TMP_FILE"
cat "$TMP_FILE" > "$INDEXED"
sort "$CURRENT" > "$TMP_FILE"
cat "$TMP_FILE" > "$CURRENT"

#Lines starting with "> " exist only in CURRENT, so they are the new files
touch "$DIFF"
diff "$INDEXED" "$CURRENT" | grep '^> ' | cut -c 3- > "$DIFF"

#Set IFS to a newline so that our for loop won't die on spaces
IFS='
'

#Encode the flac files that are new into mp3s into the MP3_OUTPUT_DIR directory.
for FLAC in $(cat "$DIFF")
do
    #We first have to check for duplicates. If an mp3 file of the same name exists in the MP3_OUTPUT_DIR then we should append (2), (3), and so on to the filename. Otherwise mp3 files would be overwritten. This situation arises when there are two or more flac files with the same name.
    TMP=$(basename "${FLAC%.flac}")
    NUM_FILES=$(ls "$MP3_OUTPUT_DIR" | sed 's/\.mp3$//' | grep -cF -- "$TMP")

    if [ "$NUM_FILES" -gt 0 ]; then
        V=$((NUM_FILES + 1))
        MP3_PATH="${FLAC%.flac} ($V).mp3"
    else
        MP3_PATH="${FLAC%.flac}.mp3"
    fi

    #Encode flac file to mp3 file.
    MP3=$(basename "$MP3_PATH")

    [ -r "$FLAC" ] || { echo "can not read file \"$FLAC\"" >&2 ; exit 1 ; }

    #Export the tags as NAME="value" lines and source them as shell variables. This trusts the tag text: a tag containing quotes, backticks or $ will break this step or be run as shell code.
    unset TITLE TRACKNUMBER DATE COMMENT ARTIST ALBUM GENRE
    metaflac --export-tags-to=- "$FLAC" | sed 's/=\(.*\)/="\1"/' > "$TMP_FILE"
    . "$TMP_FILE"
    flac -dc "$FLAC" | lame --preset standard --tt "$TITLE" \
        --tn "$TRACKNUMBER" \
        --ty "$DATE" \
        --tc "$COMMENT" \
        --ta "$ARTIST" \
        --tl "$ALBUM" \
        --add-id3v2 \
        - "$MP3_OUTPUT_DIR/$MP3"

    #We use ID3 mass tagger to write the genre tag - this will allow us to write non-standard genres. ID3 can be found at
    id3 -2 -g "$GENRE" "$MP3_OUTPUT_DIR/$MP3"

    #Update the database:
    echo "$FLAC" >> "$INDEXED"
done

rm "$DIFF"
rm "$TMP_FILE"

#Restore the default IFS
IFS="$OLD_IFS"

Note that you’ll need to edit a few things, namely the three directory settings at the top of the script. FLAC_SOURCE_DIR should point to where you store your flac files. MP3_OUTPUT_DIR should point to the folder where you want the encoded mp3 files to end up. DB_DIR is the folder that will hold the flat file databases. Mine is a folder (.flac-to-mp3) in my home directory.

You may also want to modify the encoding options. The lame command in the script encodes with lame’s standard preset (the --preset standard option) - you might want to change this.

Outside of that, you probably don’t need to modify the script much. Read on if you want to know more about the inner workings of the script.

The script keeps two flat text database files - indexed and current. indexed lists all of the files that you have already encoded. When you run the script, current is rebuilt; it is simply a list of all of the flac files in your FLAC_SOURCE_DIR. These two files are sorted, then a diff is performed to see which files exist in current but are missing from indexed. These are assumed to be the files that you have added since you last ran the script.
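To see the comparison in isolation, here is a small sketch that runs the same sort, diff, grep and cut steps on two throwaway lists. The paths are made up for the example and nothing is read from your real music folders.

```shell
#!/bin/bash
# Demonstrate the indexed/current comparison on throwaway files.
WORK_DIR=$(mktemp -d)

# "indexed": files that were converted on a previous run (made-up paths).
printf '%s\n' "/music/b.flac" "/music/a.flac" > "$WORK_DIR/indexed"
# "current": everything found on disk right now, in arbitrary order.
printf '%s\n' "/music/c.flac" "/music/a.flac" "/music/b.flac" \
    "/music/new album/d.flac" > "$WORK_DIR/current"

# Sort both lists so that identical paths line up.
sort "$WORK_DIR/indexed" > "$WORK_DIR/indexed.sorted"
sort "$WORK_DIR/current" > "$WORK_DIR/current.sorted"

# diff exits with status 1 when the files differ, which is expected here.
diff "$WORK_DIR/indexed.sorted" "$WORK_DIR/current.sorted" > "$WORK_DIR/diff.raw" || true

# Lines that diff prefixes with "> " exist only in the second file (current).
# cut -c 3- drops that two-character prefix and keeps the path.
NEW_FILES=$(grep '^> ' "$WORK_DIR/diff.raw" | cut -c 3-)
echo "$NEW_FILES"

rm -r "$WORK_DIR"
```

This prints /music/c.flac and /music/new album/d.flac, one per line, since those are the only two paths in current that indexed doesn't know about.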

Next, a check is performed to make sure that a file of the same name doesn’t already exist in the MP3_OUTPUT_DIR. If it does, then a (2), (3) and so on are appended to the file name. After that, the script exports the flac metadata into a temporary file. Then, it decodes the flac file (flac -dc) and pipes that into lame, which encodes the mp3 and writes most of the tags. Next, id3 (the id3 mass tagger application) writes the genre tag into the newly created mp3. Once these steps have run, the name of the file is appended to the indexed file (since it has already been encoded).
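The two fiddliest steps here are the duplicate naming and the metadata handling, so here is a rough sketch of how each could be written as a small function. To be clear, these are alternatives, not what the script does: unique_mp3_name checks whether a candidate name exists rather than counting matches with grep, and read_tags parses the exported tags line by line rather than sourcing them as shell code. Both function names and the TAG_ variable prefix are made up for this example.

```shell
#!/bin/bash
# Print a name for BASE in DIR that does not clobber an existing mp3:
# "BASE.mp3" if free, otherwise "BASE (2).mp3", "BASE (3).mp3", and so on.
# (hypothetical helper, not part of flac-to-mp3)
unique_mp3_name() {
    local dir="$1" base="$2" n=2
    local candidate="$base.mp3"
    while [ -e "$dir/$candidate" ]; do
        candidate="$base ($n).mp3"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Read NAME=value lines (the layout metaflac --export-tags-to=- prints) from
# stdin into TAG_NAME variables, without running the tag text as shell code.
# (hypothetical helper; the TAG_ prefix is an assumption of this sketch)
read_tags() {
    local line name value
    # Forget tags left over from the previous file.
    unset "${!TAG_@}"
    while IFS= read -r line; do
        case "$line" in
            *=*) ;;
            *) continue ;;                  # no NAME= prefix, skip the line
        esac
        name=$(printf '%s' "${line%%=*}" | tr '[:lower:]' '[:upper:]')
        value=${line#*=}
        case "$name" in
            ''|*[!A-Z0-9_]*) continue ;;    # not usable as a variable name
        esac
        printf -v "TAG_$name" '%s' "$value"
    done
}

# Hypothetical use inside the conversion loop:
# read_tags < <(metaflac --export-tags-to=- "$FLAC")
# MP3=$(unique_mp3_name "$MP3_OUTPUT_DIR" "$(basename "${FLAC%.flac}")")
```

read_tags has to be fed with a redirect as shown, not a pipe, because a pipe would run it in a subshell and the TAG_ variables would vanish when it finished. Continuation lines of multi-line tag values are skipped in this sketch.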

One final mega disclaimer: You’re at your own risk. I wrote this for my personal use and I haven’t tested it thoroughly, so back up your files or run the script on a small sample to make sure it works first!
