Michele Neylon :: Pensieri

Technology, Marketing, Domains, Thoughts

XML vs CSV

February 20, 2008 by Michele Neylon 9 Comments

I’m currently working on a new project in the evenings and weekends that involves playing with merchant datafeeds.

The big lesson I’ve learnt in the last 48 hours is that XML is a pig to work with.

The data from merchant X weighs in at 84 megs as a CSV. If you grab the same data as XML you end up with a massive 261 megs. Sure, hard drives are cheap, but the server load goes through the roof when it has to process the larger XML file…
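
A rough way to see where the extra weight comes from is to write the same handful of records out both ways and compare the byte counts. This is only a sketch: the records, the field names and the `products`/`product` element names are made up for illustration and are not taken from any real merchant feed.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical product records; the field names are illustrative only.
FIELDS = ["sku", "name", "price"]
ROWS = [
    {"sku": "A100", "name": "Widget", "price": "9.99"},
    {"sku": "A101", "name": "Gadget", "price": "19.50"},
    {"sku": "A102", "name": "Gizmo", "price": "4.25"},
]

def as_csv(rows):
    """Serialise the rows as CSV with a single header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def as_xml(rows):
    """Serialise the rows as XML, one element per field."""
    root = ET.Element("products")
    for row in rows:
        product = ET.SubElement(root, "product")
        for field in FIELDS:
            ET.SubElement(product, field).text = row[field]
    return ET.tostring(root, encoding="unicode")

csv_size = len(as_csv(ROWS).encode("utf-8"))
xml_size = len(as_xml(ROWS).encode("utf-8"))
print(f"CSV: {csv_size} bytes, XML: {xml_size} bytes")
```

CSV names each field once in the header, while XML repeats an opening and closing tag around every value in every record, so the gap widens as the row count grows.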

Moral of the story – stick to CSV

Filed Under: Techie :: Techno ::

Michele is founder and CEO of Irish hosting provider and domain name registrar Blacknight.

Comments

  1. hostyle says

    February 20, 2008 at 9:25 am

    Not sure of original author, but to quote someone on the internet: “XML is like violence: if a little doesn’t solve the problem, use more.”

  2. Dominykas says

    February 20, 2008 at 7:26 pm

    Um. I have to completely disagree with you. The problem is that CSV is not a worldwide standard as of yet – Germans use “,” (comma) as a decimal separator, so their Excel exports (and imports) use “;” (semicolon) as the “value separator”, rather than the English/American “.” (dot) for decimals and “,” (comma) for values. Consider the fact that other countries have even more decimal separator symbols – I haven’t done my research fully there, but I’d suppose there might be a problem or two elsewhere. Sure – CSV is quick’n’easy’n’dirty, but XML gives you the real thing. That of course does not apply to “local market” products.
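
The locale issue described here can be handled if the reader is told which conventions a feed uses. A minimal sketch, assuming a two-column name/price feed; the `parse_feed` helper and the sample data are hypothetical, and thousands separators are not handled:

```python
import csv
import io
from decimal import Decimal

def parse_feed(text, delimiter=",", decimal_sep="."):
    """Parse two-column name/price text into (name, Decimal) tuples.

    The delimiter and decimal separator must be supplied by the caller;
    nothing in a CSV file itself declares them.
    """
    rows = []
    for name, price in csv.reader(io.StringIO(text), delimiter=delimiter):
        # Normalise the decimal separator before converting.
        rows.append((name, Decimal(price.replace(decimal_sep, "."))))
    return rows

english = "Widget,9.99\nGadget,19.50\n"
german = "Widget;9,99\nGadget;19,50\n"

print(parse_feed(english))
print(parse_feed(german, delimiter=";", decimal_sep=","))
```

The catch is that the conventions have to be known out of band for each feed, which is the portability problem being raised.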

  3. Michele Neylon says

    February 20, 2008 at 8:13 pm

    @Dominykas – the software I’m using can handle multiple formats, but the HUGE XML files are not making it happy
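
For anyone writing their own importer rather than using off-the-shelf software, one common way to cope with very large XML files is to stream them instead of loading the whole tree. A minimal sketch using `xml.etree.ElementTree.iterparse` from the Python standard library; the `product` element name and the sample document are assumptions, not the layout of any real feed:

```python
import io
import xml.etree.ElementTree as ET

def iter_products(source, tag="product"):
    """Yield one dict per <product> element without building the full tree."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield {child.tag: child.text for child in elem}
            # Drop this product's children so they can be freed; the
            # emptied element itself stays attached to the root.
            elem.clear()

sample = io.BytesIO(
    b"<products>"
    b"<product><sku>A100</sku><price>9.99</price></product>"
    b"<product><sku>A101</sku><price>19.50</price></product>"
    b"</products>"
)
products = list(iter_products(sample))
print(products)
```

This keeps memory use down, but every tag still has to be read and parsed, so it does not remove the processing overhead of the larger file.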

  4. hostyle says

    February 21, 2008 at 8:55 am

    Dominykas: how is that a problem for CSV?
    US/UK: "12,345.678","whatever","blah"
    European: "12.345,678","whatever","blah"
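
The point about quoting can be checked with Python's standard `csv` module, which treats a comma inside a double-quoted field as data rather than as a separator. A small sketch using the two rows above; the `fields` helper is just for illustration:

```python
import csv
import io

us_line = '"12,345.678","whatever","blah"\n'
eu_line = '"12.345,678","whatever","blah"\n'

def fields(line):
    """Return the fields of a single CSV line."""
    return next(csv.reader(io.StringIO(line)))

print(fields(us_line))  # ['12,345.678', 'whatever', 'blah']
print(fields(eu_line))  # ['12.345,678', 'whatever', 'blah']
```

Both rows split into three fields. What the parser cannot tell you is whether the first field means twelve thousand and something or twelve point something; that still depends on knowing the locale of the feed.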

  5. Hugh says

    February 21, 2008 at 7:19 pm

    Michele,
    Some prefer XML for constantly updated items like news feeds etc, as it’s easier to figure out and parse small chunks of it.
    I’m working on a price comparison site at the moment, and we’ll be pulling in 100s of datafeeds from loads of merchants. For this it makes sense for us to use CSV – we download each feed once a day using cron, and run a shell script once a day to unzip and import each feed into the database. Currently it takes about 2 hours for a full update, but based on what I’ve seen testing XML feeds, it’d take at least double that using XML.
    CSV – simple and effective.
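
The import step of a daily job like this might look roughly as follows. This is only a sketch under stated assumptions: the comment does not say which database or schema is used, so SQLite, the `products` table, the column names and the `import_feed` helper are all hypothetical stand-ins.

```python
import csv
import io
import sqlite3

def import_feed(conn, merchant, feed_file):
    """Replace one merchant's rows with the contents of its CSV feed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(merchant TEXT, sku TEXT, name TEXT, price TEXT)"
    )
    reader = csv.DictReader(feed_file)
    rows = ((merchant, r["sku"], r["name"], r["price"]) for r in reader)
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM products WHERE merchant = ?", (merchant,))
        conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
feed = io.StringIO("sku,name,price\nA100,Widget,9.99\nA101,Gadget,19.50\n")
import_feed(conn, "merchant-x", feed)
count = conn.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(count)
```

Rows are streamed from the reader straight into `executemany`, so the feed is never held in memory as a whole, and wrapping the delete and insert in one transaction means a failed import leaves yesterday's data in place.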

  6. Michele Neylon says

    February 21, 2008 at 8:27 pm

    Hugh
    You’re in the same boat as me so 🙂
    Michele

  7. Ken Stanley says

    February 24, 2008 at 6:09 pm

    CSV is great for keeping file sizes down if there’s a linear pattern to the content, like a SQL dump. As Dominykas said, it’s not a standard and if the data needs to be portable, this can cause problems. XML is great for storing scalable pattern, non-linear data and has its uses too – but the nature of it, where each piece of data is marked-up/tagged means that it’s seriously bloated. XML and CSV are very different in my opinion. I’ll rarely use XML where CSV will suffice.

  8. Tom Gleeson says

    February 24, 2008 at 7:26 pm

    I once wrote a post about the great data lingua franca debate (http://blog.gobansaor.com/2007/03/03/tables-vs-xml-the-data-lingua-franca-debate/)
    but of course there was no debate, at least then, good to see others appreciating the “power” of the humble CSV table 😉
    Tom

  9. Michele Neylon says

    February 24, 2008 at 9:47 pm

    @Ken – the data I’m working with is provided by various merchants. Using the XML version simply adds bloated files with an expensive processing overhead. The CSV files are relatively light by comparison
    @Tom – The right tool for the job 🙂

Site hosted in Ireland by Blacknight - Content copyright Michele Neylon