Tuesday, May 5, 2009

ID3 tags in MP3 files

What is ID3?
ID3 is a metadata container most often used in conjunction with the MP3 audio file format. It allows information such as the title, artist, album, track number, or other information about the file to be stored in the file itself.

That is Wikipedia's explanation. Put simply, it is the block of descriptive information inside an MP3 file.

So far there are three versions:
1. ID3v1
2. ID3v1.1
3. ID3v2 (three revisions so far: 2.2, 2.3 (the widely used one) and 2.4)

ID3v2 is an internationalized, extensible solution. It is not compatible with ID3v1 (it has little to no relation to the ID3v1 standard).

Where the ID3 block lives: in v1, it is placed at the very end of the file (The tag was placed at the end of the file to maintain compatibility with older media players.)
In v2, it is placed at the very beginning (ID3v2 tags are of variable size, and usually occur at the start of the file, to aid streaming media.)
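
The two locations described above can be checked with a few lines of Python. This is only a minimal sketch: the function name `detect_id3` is made up for illustration, and the input is synthetic bytes rather than a real MP3 file.

```python
import io

def detect_id3(stream):
    """Return (has_v2, has_v1) for a seekable binary stream."""
    stream.seek(0)
    has_v2 = stream.read(3) == b"ID3"      # ID3v2 usually starts the file
    has_v1 = False
    stream.seek(0, io.SEEK_END)
    if stream.tell() >= 128:               # too short to hold an ID3v1 tag otherwise
        stream.seek(-128, io.SEEK_END)
        has_v1 = stream.read(3) == b"TAG"  # ID3v1 fills the last 128 bytes
    return has_v2, has_v1

# Synthetic data standing in for a real MP3 file.
fake = io.BytesIO(b"ID3" + b"\x00" * 200 + b"TAG" + b"\x00" * 125)
print(detect_id3(fake))  # (True, True)
```

For a real file, pass `open(path, "rb")` instead of the `BytesIO` object.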

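A GBK-to-UTF-8 conversion snippet that circulates online checks for the tag like this:
def has_id3v1(filename):
    # An ID3v1 tag occupies the last 128 bytes of the file and starts with "TAG".
    with open(filename, 'rb') as f:
        try:
            f.seek(-128, 2)
        except IOError:
            # The file is shorter than 128 bytes, so it cannot hold a tag.
            return False
        return f.read(3) == b"TAG"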


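This check presumably ignores the Enhanced tag, an unofficial extension stored before the ID3v1 tag that "is only supported by few programs, not including XMMS or Winamp". ID3v1.1 keeps the same 128-byte layout and "TAG" marker, so the check covers it as well.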
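
Once the "TAG" marker is found, the 128-byte block can be unpacked field by field. The sketch below assumes the standard ID3v1 layout (30-byte title, artist and album, 4-byte year, 30-byte comment, 1-byte genre) and the ID3v1.1 convention of storing the track number in the last comment byte. The name `parse_id3v1` and the `encoding` parameter are made up for illustration, and the sample tag is synthetic.

```python
import struct

def parse_id3v1(tag, encoding="latin-1"):
    """Decode a 128-byte ID3v1 / ID3v1.1 block into a dict, or return None."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        return None
    title, artist, album, year, comment, genre = struct.unpack(
        "<3x30s30s30s4s30sB", tag)
    info = {"track": None, "genre": genre}
    # ID3v1.1: a zero byte at comment[28] means comment[29] is the track number.
    if comment[28] == 0 and comment[29] != 0:
        info["track"] = comment[29]
        comment = comment[:28]
    fields = (("title", title), ("artist", artist), ("album", album),
              ("year", year), ("comment", comment))
    for name, raw in fields:
        # Fields are NUL-padded; keep only the part before the first NUL.
        info[name] = raw.split(b"\x00")[0].decode(encoding, "replace").strip()
    return info

# Synthetic 128-byte tag (not read from a real file), with GBK-encoded text.
sample = (b"TAG"
          + "死了都要爱".encode("gbk").ljust(30, b"\x00")
          + "信乐队".encode("gbk").ljust(30, b"\x00")
          + b"\x00" * 30                 # album left empty
          + b"2009"
          + b"\x00" * 28 + b"\x00\x03"   # empty comment, then track 3
          + b"\x0c")                     # genre index 12
info = parse_id3v1(sample, encoding="gbk")
print(info["title"], info["artist"], info["track"])
```

Passing `encoding="gbk"` is what makes Chinese tags readable here; ID3v1 itself does not record which encoding was used, so the caller has to guess.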


ID3v1 has a fixed-length layout; from ID3v2 on, the tag is variable-length and its structure is far more complex, as described here:


http://ample.sourceforge.net/developers.shtml
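
Because an ID3v2 tag is variable-length, its 10-byte header has to say how long the tag is. That size is stored as a "synchsafe" integer: four bytes that each contribute only their low 7 bits. A minimal sketch of decoding the header follows; the function name `parse_id3v2_header` is made up for illustration and the input bytes are synthetic.

```python
def parse_id3v2_header(header):
    """Parse the 10-byte ID3v2 tag header, or return None if it is not one."""
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    major, revision, flags = header[3], header[4], header[5]
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)  # keep the low 7 bits of each byte
    # The size counts the bytes after the 10-byte header, not the header itself.
    return {"version": (major, revision), "flags": flags, "size": size}

# Synthetic ID3v2.3 header whose size bytes 00 00 02 01 decode to 257.
print(parse_id3v2_header(b"ID3\x03\x00\x00\x00\x00\x02\x01"))
```

Reading the four size bytes as an ordinary 32-bit integer would give 513 here instead of 257, so the two interpretations only agree for tags smaller than 128 bytes.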
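With nothing better to do, I wrote my own script to read the ID3 information:
#coding:utf-8
import struct

def synchsafe(data):
    # The ID3v2 tag size is stored as four bytes with 7 significant bits each.
    return ((data[0] & 0x7f) << 21) | ((data[1] & 0x7f) << 14) | ((data[2] & 0x7f) << 7) | (data[3] & 0x7f)

song = open("/home/jessinio/jessinio/music/信乐队/死了都要爱.mp3", "rb")
header = song.read(0x0a)
print("Header ID: " + header[:0x03].decode("latin-1"))
print("Version: " + repr(header[0x03:0x05]))
print("Flags: " + repr(header[0x05]))
# The stored size does not include the 10-byte tag header itself.
total_length = synchsafe(header[0x06:0x0a])
print("Size: " + str(total_length))
song.seek(0, 0)
header = song.read(0x0a + total_length)
song.close()

def get_id(header, frame_header_id_s=0x0a):
    # Parse one ID3v2.3 frame: 4-byte ID, 4-byte big-endian size, 2-byte flags.
    # (ID3v2.4 stores the frame size as a synchsafe integer instead.)
    frame_header_id_e = frame_header_id_s + 3
    frame_id = header[frame_header_id_s:frame_header_id_e + 1]
    if len(header) - frame_header_id_s < 10 or frame_id == b"\x00" * 4:
        return None  # reached the padding or the end of the tag
    print("Frame Header ID: " + frame_id.decode("latin-1"))

    size_s = frame_header_id_e + 1
    size_e = size_s + 3
    length = struct.unpack(">L", header[size_s:size_e + 1])[0]
    print("Size: " + str(length))

    frame_flags_s = size_e + 1
    frame_flags_e = frame_flags_s + 1
    print("Frame Flags: " + repr(header[frame_flags_s:frame_flags_e + 1]))
    frame_context_s = frame_flags_e + 1
    # In text frames the first content byte marks the encoding; these files hold GBK text.
    prefix = header[frame_context_s:frame_context_s + 1]
    content = header[frame_context_s + 1:frame_context_s + length]
    print("Frame Context: prefix:%s, suffix:%s " % (repr(prefix), content.decode("gbk", "replace")))
    next_point = frame_context_s + length
    print("####: " + str(next_point))
    return next_point

# Walk the frames until the padding or the end of the tag is reached.
next_point = 0x0a
while next_point is not None and next_point < len(header):
    next_point = get_id(header, next_point)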


All of the effort above amounts to the same thing as:
from mutagen.id3 import ID3
metadata = ID3("/home/jessinio/jessinio/music/信乐队/死了都要爱.mp3")
