{"id":131,"date":"2006-04-13T21:49:01","date_gmt":"2006-04-13T21:49:01","guid":{"rendered":"http:\/\/dev.gregbueno.com\/wp\/sakufu\/2006\/04\/13\/loading_musicbr\/"},"modified":"2006-04-13T21:49:01","modified_gmt":"2006-04-13T21:49:01","slug":"loading_musicbr","status":"publish","type":"post","link":"https:\/\/gregbueno.com\/sakufu\/2006\/04\/13\/loading_musicbr\/","title":{"rendered":"Loading Musicbrainz into MySQL"},"content":{"rendered":"<p>For some unfathomable reason, I&#8217;m thinking about resurrecting the <a href=\"http:\/\/archive.musicwhore.org\/artists\/\">Musicwhore.org artist directory<\/a>. That thing was a pain in the butt to maintain, and it burned me out.<\/p>\n<p>So why bring it back? Well, this time around, I have no intention of maintaining it. That is, I&#8217;m not going to spend my spare time doing data entry.<\/p>\n<p>Rather, I want the directory to be powered by <a href=\"http:\/\/musicbrainz.org\/\">Musicbrainz<\/a>.<\/p>\n<p><!--more--><\/p>\n<p>I think some of my work is a tad more thorough than Musicbrainz, but they have breadth, depth and the technology to merge more easily with <a href=\"http:\/\/www.freedb.org\/\">freedb<\/a>. Since the content is open to the public domain, I can adapt what I&#8217;ve done with what the Musicbrainz community has done.<\/p>\n<p>And it&#8217;ll be easier to incorporate some of the new stuff I&#8217;m writing now on <a href=\"http:\/\/www.musicwhore.org\/\">Musicwhore.org proper<\/a>.<\/p>\n<hr size=\"1\" width=\"50%\">\n<p>There are, however, a number of obstacles.<\/p>\n<ul>\n<li> Musicbrainz data dumps are in PostgreSQL.\n<li> Musicwhore.org is powered by MySQL.\n<li> I develop at home on Windows, while Musicbrainz and the web host for Musicwhore.org run UNIX.<\/ul>\n<p>At first, I tried using some of the PERL bindings from the <a href=\"http:\/\/musicbrainz.org\/products\/client\/index.html\">client\/software development kit<\/a>. ActiveState doesn&#8217;t store any of the <a href=\"http:\/\/search.cpan.org\/search?query=musicbrainz&amp;mode=all\">Musicbrainz modules<\/a> listed over at <a href=\"http:\/\/search.cpan.org\/\">CPAN<\/a>, and the PPD files <a href=\"http:\/\/www.punch.net\/musicbrainz\/perl\/\">out there<\/a> keep giving me errors. So I had to compile those. Lucky for me I have Visual Studio 6 installed.<\/p>\n<p>After I went through all that drama, I ran across a module called <tt><a href=\"http:\/\/search.cpan.org\/author\/TOMI\/AudioFile-Identify-MusicBrainz-0.4\/lib\/AudioFile\/Identify\/MusicBrainz.pm\">AudioFile::Identify::Musicbrainz<\/a><\/tt>, which pretty much precluded the need to compile the existing Musicbrainz modules. D&#8217;oh.<\/p>\n<p>Unfortunately, <tt>AudioFile::Identify::Musicbrainz<\/tt> has a somewhat narrow purpose. I couldn&#8217;t use it as a web service.<\/p>\n<p>So I spent the first half of the day trying to set up PostgreSQL on Windows to browse a dump of the database. I hit a major wall with CVS &#8212; I used CVS very briefly at a job that won&#8217;t be named, but even then, I didn&#8217;t have to contend with WinCVS. I went so far as to install Cygwin (nice!) but at that point, I knew I didn&#8217;t want to have to go through all these steps setting up a PostgreSQL development environment at home.<\/p>\n<p>It was that point when I found a throwaway paragraph in the <a href=\"http:\/\/cvs.musicbrainz.org\/cvs\/mb_server\/INSTALL?rev=HEAD&amp;content-type=text\/vnd.viewcvs-markup\">installation instructions<\/a>. Evidently, it was possible to use three commands to load the PostgreSQL dump in MySQL. As I would discover, it wasn&#8217;t necessarily that easy.<\/p>\n<hr size=\"1\" width=\"50%\">\n<p>To wit, the commands are as follows:<\/p>\n<blockquote><p><tt>mysqladmin create musicbrainz<\/p>\n<p>mysql musicbrainz &lt; CreateTables.sql<\/p>\n<p>mysqlimport musicbrainz mbdump\/*<\/tt><\/p><\/blockquote>\n<p>In Windows, I used phpMyAdmin to create the musicbrainz database. When I tried to load <tt>&lt;a href=&quot;<a href=\"http:\/\/cvs.musicbrainz.org\/cvs\/mb_server\/admin\/sql\/CreateTables.sql?rev=HEAD&amp;content-type=text\/vnd.viewcvs-markup\">CreateTables.sql<\/a><\/tt> with the <tt>mysql<\/tt> command, I received a number of errors.<\/p>\n<p>Naturally, MySQL and PostgreSQL differ slightly in syntax, so I had to spend another hour or so searching the web for PostgreSQL-to-MySQL syntax conversion. There were a lot of links to migrate <em>from<\/em> MySQL, not <em>to<\/em> MySQL. I found <a href=\"http:\/\/www.lightbox.ca\/pg2mysql.php\">one site<\/a> that did a rough conversion, but there were still a few things to clean up manually (comments, some array-styled fields).<\/p>\n<p>I have an old version of MySQL installed on my computer at work, and it gave me hassles for trying to use <tt>mysqlimport<\/tt>. So I had to use phpMyAdmin again. Eventually, I managed to put a Musicbrainz dump on MySQL.<\/p>\n<p>So for posterity I&#8217;m going to list the steps I took to make this whole thing happen:<\/p>\n<ol>\n<li> <a href=\"http:\/\/musicbrainz.org\/products\/server\/download.html\">Download<\/a> a dump of the Musicbrainz database core.\n<li> Uncompress the dump with bunzip and tar. Uncompressed, the dump is about 690MB.\n<li> <a href=\"http:\/\/cvs.musicbrainz.org\/cvs\/mb_server\/admin\/sql\/CreateTables.sql?rev=HEAD&amp;content-type=text\/vnd.viewcvs-markup\">Download <\/a> <tt>CreateTables.sql<\/tt> from the Musicbrainz CVS server.\n<li> Edit <tt>CreateTables.sql<\/tt> to use MySQL syntax. You can sift it through a conversion script, edit it by hand or do both.\n<li> Remove any <tt>CREATE<\/tt> queries from <tt>CreateTables.sql<\/tt> that aren&#8217;t included in the dump. There are a number of tables included in the SQL file that require a separate dump. Don&#8217;t bother with them.\n<li> Load <tt>CreateTables.sql<\/tt>:<br \/>\n<blockquote><p><tt><\/p>\n<p>mysql musicbrainz &lt; CreateTables.sql<\/tt><\/p><\/blockquote>\n<li> If you can&#8217;t use <tt>mysqlimport<\/tt> for some reason or other, build another SQL file &#8212; let&#8217;s call it <tt>LoadMB.sql<\/tt> &#8212; with a series of separate <tt>LOAD INTO FILE<\/tt> queries for each dumped file. Each record is tab-delimted and each dumped file is named after its table.<br \/>\n<blockquote><p><tt>LOAD DATA LOCAL INFILE 'C:\\\\Path\\\\to\\\\mbdump\\\\album' REPLACE INTO TABLE `album` FIELDS TERMINATED BY '\\t' ESCAPED BY '\\\\' LINES TERMINATED BY '\\r\\n'<\/tt><\/p><\/blockquote>\n<li> Import the dump into MySQL:<br \/>\n<blockquote><p><tt>mysql musicbrainz &lt; LoadMB.sql<\/tt><\/p><\/blockquote>\n<\/ol>\n<p>Of course, remember to use the <tt>--host<\/tt>, <tt>--user<\/tt>, and <tt>--password<\/tt> options when you run <tt>mysqladmin<\/tt>, <tt>mysql<\/tt> and <tt>mysqlimport<\/tt>.<\/p>\n<p>After I got the Musicbrainz dump loaded into MySQL, I ran a few queries on it, edited some Musicwhore.org pages to connect to it &#8230; looks promising. But there&#8217;s still a lot of work to be done resolving my database structure with Musicbrainz.<\/p>\n<p>Still &#8230; it feels kind of neat knowing I could perhaps bring back the artist directory in a manner far more powerful than what it originally was.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For posterity I&#8217;m going to list the steps I took to load a PostgreSQL dump of Musicbrainz into MySQL on Windows.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[6],"tags":[],"class_list":["post-131","post","type-post","status-publish","format-standard","hentry","category-technophilia-professional"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p4Bkjq-27","_links":{"self":[{"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/posts\/131","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/comments?post=131"}],"version-history":[{"count":0,"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/posts\/131\/revisions"}],"wp:attachment":[{"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/media?parent=131"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/categories?post=131"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gregbueno.com\/sakufu\/wp-json\/wp\/v2\/tags?post=131"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}