javaboutique
Reviews: Davisor Offisor 1.5.1

Offisor XML: an improvement?

Using Offisor is easy, but the real question is whether it is beneficial. Answering that requires analyzing the XML format it outputs and how easy or difficult that format is to work with. To test this, I saved a Word doc as HTML (using Word 2002, just as a test, since it was undocumented), parsed it with Offisor, and then compared the results. I found significant differences on a number of levels. Take a look at the following files:

These four documents differ quite a bit. To start, the XML version is clean, readable, and intuitive. The transformation takes almost no time; this is one lean application. Behind the scenes, the HTML rendition is produced simply by applying an eXtensible Stylesheet Language Transformations (XSLT) stylesheet to the XML. I was impressed with how easy this was.
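To make the XSLT step concrete, here is a minimal sketch using only the standard `javax.xml.transform` API that ships with the JDK. It does not call Offisor itself. The element names (`document`, `heading`, `para`) and the stylesheet are invented for illustration and are not Offisor's actual XML vocabulary.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltToHtmlSketch {

    // Hypothetical input: these element names are assumptions made for this
    // example and do NOT reflect the XML that Offisor really produces.
    public static final String SAMPLE_XML =
        "<document><heading>Quarterly Report</heading>"
        + "<para>Sales rose.</para></document>";

    // Hypothetical stylesheet mapping the invented elements to HTML.
    public static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\" indent=\"no\"/>"
        + "<xsl:template match=\"/document\">"
        + "<html><body><xsl:apply-templates/></body></html>"
        + "</xsl:template>"
        + "<xsl:template match=\"heading\">"
        + "<h1><xsl:value-of select=\".\"/></h1>"
        + "</xsl:template>"
        + "<xsl:template match=\"para\">"
        + "<p><xsl:value-of select=\".\"/></p>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Compiles the stylesheet and applies it to the XML, returning the result as a string.
    public static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

Swapping in a different stylesheet is all it takes to target another output format, which is the main appeal of keeping the content in XML.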

Of the HTML documents, the Offisor HTML document has the fewest lines: 187 (note: Offisor also added a couple of lines stating that I was using a demo version). In comparison, the Word-exported HTML was a whopping 579 lines and the Dreamweaver-parsed HTML was 264. These counts were taken after a simple cleanup to remove empty space. The Dreamweaver example did retain some of the styling, which arguably could be nice in that documents look quite familiar after conversion. However, the styling includes font and other information not as style classes but as explicit elements. This could be a problem for content management systems that attempt to apply a single style site-wide. Overall, I felt that Offisor shone. Offisor documents can also be parsed dynamically, and with XSLT developers can transform that XML into whatever they like. By contrast, both the Save As method and Dreamweaver parsing require additional steps before content can be uploaded. The XML output by Offisor is intuitive and straightforward to use. A Word XML reference can be found at the Davisor site.
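For readers who want to repeat this kind of comparison on their own files, the following is one rough way to count lines while ignoring blank ones. It is only a sketch of a plausible method, not the exact cleanup used for the figures above, and the file names passed on the command line are up to you.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class NonBlankLineCount {

    // Counts lines that contain something other than whitespace.
    public static long countNonBlank(List<String> lines) {
        return lines.stream().filter(line -> !line.trim().isEmpty()).count();
    }

    public static void main(String[] args) throws IOException {
        for (String name : args) {
            // ISO-8859-1 maps every byte to a character, so reading never fails
            // on files whose real encoding is unknown (an assumption that is
            // good enough for counting lines, not for displaying text).
            List<String> lines =
                Files.readAllLines(Paths.get(name), StandardCharsets.ISO_8859_1);
            System.out.println(name + ": " + countNonBlank(lines));
        }
    }
}
```

Line counts are a crude measure, since they depend on how each tool wraps its markup, but they give a quick sense of how much extra markup each route produces.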

Of course, I didn't compare the actual proprietary Word doc file. Why? Well, it's a binary format, so it really isn't worth comparing. If this weren't the case, there wouldn't be any need for tools like Offisor, now would there?

How Offisor can be used

Where I think Offisor gets exciting is in its potential applications. Converting documents from a popular proprietary format, as well as basic HTML, into a universal one is undoubtedly a move in the right direction. The possibilities are numerous:

  • E-mail Attachment Conversion - Dynamically converting Word attachments in e-mail brings several advantages: it eliminates virus threats, enables simple in-browser HTML viewing of attachments, and allows attachments to be viewed on computers and devices that don't have Word installed.
  • Content Management Delivery - Clearly the most obvious: everyone in the office can simply upload Word documents, as well as HTML and other documents, and have them translated into HTML, Wireless Markup Language (WML), or any other form of XML that can be read by a variety of devices and applications.
  • Universal Content Repository - Storing all HTML and Word docs in a universal format, especially XML, can eliminate numerous data storage issues. Because XML is text-based, as opposed to Word's binary format, documents can be stored in character format.
  • Superior Indexing - Once again, because XML is text-based, as opposed to Word's binary format, documents can be indexed and users can perform keyword searches on the content.
  • Easy PDF Conversion - Documents can be converted to PDF easily, even on the fly. The examples even include a working version of this functionality. PDF's read-only nature makes it a superior format in some circumstances, such as contracts, and it helps users who are concerned about Word viruses.
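The indexing idea in the list above can be sketched with nothing but the JDK's DOM parser. This is a toy in-memory keyword index, not a production search engine and not part of Offisor. The document IDs and the element names in the usage example are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class XmlKeywordIndexSketch {

    // Maps each lowercase word to the IDs of the documents that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Parses the XML and indexes every word found in its text nodes.
    public void add(String docId, String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        indexNode(doc.getDocumentElement(), docId);
    }

    // Walks the tree; text nodes are tokenized one at a time so that words
    // from neighbouring elements are never glued together.
    private void indexNode(Node node, String docId) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue().toLowerCase(Locale.ROOT);
            for (String token : text.split("\\W+")) {
                if (!token.isEmpty()) {
                    index.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
                }
            }
        }
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            indexNode(child, docId);
        }
    }

    // Returns the IDs of documents containing the keyword (case-insensitive).
    public Set<String> search(String keyword) {
        return index.getOrDefault(keyword.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) throws Exception {
        XmlKeywordIndexSketch idx = new XmlKeywordIndexSketch();
        // Hypothetical converted documents; the markup is invented for this example.
        idx.add("contract.xml",
            "<document><heading>Contract Terms</heading><para>Payment due in 30 days.</para></document>");
        idx.add("memo.xml", "<document><para>Payment received.</para></document>");
        System.out.println(idx.search("payment"));
    }
}
```

The same walk would not be possible against a binary Word file without first decoding the format, which is the point of converting to XML before indexing.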

Conclusion

Quite frankly, I couldn't be more impressed with this tool. It is simple and cheap (300 Euros, about US$350, for the tool only, or 600 Euros, about US$700, including one year of support and upgrades). It does exactly what it says, converting binary Word docs and loosely structured HTML docs into XML, and it does it well. The documentation is comprehensive and the API is simple. I look forward to being able to parse other Office documents, such as Excel and PowerPoint files. While performing these types of translations within the Microsoft development environment is probably within reach for developers, in Java/J2EE this has always been a challenge. I think that Offisor can help us all deal with this pervasive issue.


Drew Falkman is the author of the JRun Web Application Construction Kit and co-author (with Ben Forta) of Reality ColdFusion: J2EE Integration, both published by Macromedia Press. Over the past 6 years, Drew has developed over 150 Web applications of all sizes using ColdFusion and Java. Currently Drew consults, speaks at events, writes for numerous publications, and teaches courses at Portland State University. His latest project through his consulting company, Veraison LLC, was a real-time cattle auction using Flash Remoting and Flash Communication Server. In addition, Drew is a member of Team Macromedia, a certified ColdFusion Developer, and a certified Macromedia instructor.
