Friday, April 4, 2008

[JAVA>Rome, RSS/XML] ParsingFeedException (Invalid XML) error occurs when using the Rome RSS reader

References
  1. Rome website
    https://rome.dev.java.net/
Problems
When using the Rome RSS reader, a "com.sun.syndication.io.ParsingFeedException: Invalid XML" error sometimes occurs. The problem happens while parsing an RSS feed: Rome raises a ParsingFeedException whose cause is a NumberFormatException.

Stack trace when the problem occurs
com.sun.syndication.io.ParsingFeedException: Invalid XML
com.sun.syndication.io.ParsingFeedException: Invalid XML
at com.sun.syndication.io.WireFeedInput.build(WireFeedInput.java:185)
at com.sun.syndication.io.SyndFeedInput.build(SyndFeedInput.java:122)
at RomeTest.main(RomeTest.java:35)
Caused by: java.lang.NumberFormatException: For input string: "09:31"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:63)
at java.lang.Long.parseLong(Long.java:427)
at java.lang.Long.parseLong(Long.java:476)
at com.sun.syndication.io.impl.RSS092Parser.parseItem(RSS092Parser.java:107)
at com.sun.syndication.io.impl.RSS093Parser.parseItem(RSS093Parser.java:39)
at com.sun.syndication.io.impl.RSS094Parser.parseItem(RSS094Parser.java:68)
at com.sun.syndication.io.impl.RSS090Parser.parseItems(RSS090Parser.java:263)
at com.sun.syndication.io.impl.RSS090Parser.parseChannel(RSS090Parser.java:178)
at com.sun.syndication.io.impl.RSS091UserlandParser.parseChannel(RSS091UserlandParser.java:84)
at com.sun.syndication.io.impl.RSS092Parser.parseChannel(RSS092Parser.java:49)
at com.sun.syndication.io.impl.RSS094Parser.parseChannel(RSS094Parser.java:45)
at com.sun.syndication.io.impl.RSS090Parser.parse(RSS090Parser.java:82)
at com.sun.syndication.io.WireFeedInput.build(WireFeedInput.java:252)
at com.sun.syndication.io.WireFeedInput.build(WireFeedInput.java:179)
Reason why this error occurs
Some RSS feeds have an enclosure tag inside the channel/item element, as shown below, and the enclosure element defines a length attribute.
 <enclosure url="http://domain.com/file.mp3" length="123456789" type="audio/mpeg" />
According to Wikipedia (http://en.wikipedia.org/wiki/RSS_Enclosures) and other websites (http://www.w3schools.com/rss/rss_tag_enclosure.asp), the length attribute gives the size of the media file in bytes and is a required field. It should be a numeric value, but in some RSS feeds it is not: some feeds include comma separators, as in "123,456,789", and others contain a value such as "09:31" that looks like a timestamp.
In this case, when Rome tries to convert the length attribute from a String to a long with the java.lang.Long.parseLong method, the method throws a NumberFormatException because the value is not in a valid numeric format.
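
The failure can be reproduced without Rome. The following is a minimal sketch, not Rome code: the class name EnclosureLengthDemo and the helper parsesAsLong are made up for illustration. It only shows how Long.parseLong reacts to the three kinds of length values mentioned above.

```java
// Hypothetical demo class for illustration; it is not part of Rome.
class EnclosureLengthDemo {

    // Returns true if Long.parseLong accepts the value,
    // false if it throws NumberFormatException.
    static boolean parsesAsLong(String value) {
        try {
            Long.parseLong(value);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain number, a number with comma separators, and a time-like value.
        String[] samples = {"123456789", "123,456,789", "09:31"};
        for (String sample : samples) {
            String result = parsesAsLong(sample) ? "parsed" : "NumberFormatException";
            System.out.println(sample + " -> " + result);
        }
    }
}
```

Only the first sample is accepted; the comma-separated and time-like values are both rejected, which matches the "For input string" cause in the stack trace.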

Solution
  1. Fixed file
    rome-0.9/src/java/com/sun/syndication/io/impl/RSS092Parser.java
  2. Fixed function
    Change the parseItem() function as shown in item 5 (Fix example).
  3. Description
    The fix catches the exception raised while converting the length attribute from a String to a long.
    My fix catches the NumberFormatException, removes the comma characters from the value, and retries the conversion. If the retry also fails, the exception is ignored and the length attribute of the enclosure is not set.
  4. Fixed jar file
    Jar file: http://groups.google.com/group/taapps-sourcecode-libraries/web/rome-0.9-tafix.jar
    This jar file includes both this fix and the fix for the problem described at http://taapps-javalibs.blogspot.com/2007/06/javarome-rss-when-using-rome-rss-reader.html.
  5. Fix example
    att = e.getAttributeValue("length");//getRSSNamespace()); DONT KNOW WHY DOESN'T WORK
    if (att!=null && att.trim().length()>0) {
        // Start of customization, added by tatsuya anno (taapps@gmail.com).
        // Converting the attribute value to a long sometimes fails and
        // surfaces as com.sun.syndication.io.ParsingFeedException.
        String trimmedAtt = att.trim();
        long enclosureLength = -1;
        try {
            enclosureLength = Long.parseLong(trimmedAtt);
        } catch (NumberFormatException exception) {
            try {
                // Remove comma characters before converting the string to a long.
                trimmedAtt = trimmedAtt.replaceAll(",", "");

                // Retry the conversion; if it fails again, ignore the
                // exception and leave the length unset.
                enclosureLength = Long.parseLong(trimmedAtt);
            } catch (NumberFormatException ignore) {}
        }

        // Set the length only if enclosureLength is 0 or greater.
        if (enclosureLength >= 0) {
            enclosure.setLength(enclosureLength);
        }
        // End of customization.
    }
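
To try the retry logic outside of Rome, it can be pulled out into a small standalone helper. The following is only a sketch under my own assumptions: the class LenientLengthParser and the method parseLength are hypothetical names, not part of Rome or of the fixed jar, and -1 is used as a "could not parse" marker in the same way as in the fix example.

```java
// Hypothetical helper for illustration; it is not part of Rome or the fixed jar.
class LenientLengthParser {

    // Parses an enclosure length value. Returns -1 when the value is null,
    // empty, or still not numeric after the comma characters are removed.
    static long parseLength(String raw) {
        if (raw == null) {
            return -1;
        }
        String trimmed = raw.trim();
        if (trimmed.length() == 0) {
            return -1;
        }
        try {
            return Long.parseLong(trimmed);
        } catch (NumberFormatException first) {
            try {
                // Retry once with the comma separators removed.
                return Long.parseLong(trimmed.replace(",", ""));
            } catch (NumberFormatException second) {
                // Give up; the caller should leave the length unset.
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        String[] samples = {"123456789", "123,456,789", "09:31"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + parseLength(sample));
        }
    }
}
```

A caller would set the enclosure length only when the returned value is 0 or greater, so a value such as "09:31" leaves the length unset instead of aborting the whole feed.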
