javaboutique
Reviews: Davisor Offisor 1.5.1

Offisor XML: an improvement?

Using Offisor is easy, but the real question is whether it is beneficial. Answering that requires analyzing the XML format it outputs and how easy or difficult that format is to work with. To test this, I saved a Word doc as HTML (from Word 2002, just as a test, since this was undocumented), parsed it with Offisor, and compared the results. What I discovered was a significant difference on a number of levels. The files I compared were the Offisor XML output, the Offisor HTML rendition, the HTML exported by Word, and the Word HTML parsed by Dreamweaver.

These four documents differ quite a bit. To start, the XML version is pretty, readable, intuitive, and clean. The transformation takes no time; this is one lean application. Behind the scenes, the HTML rendition is produced simply by applying an Extensible Stylesheet Language Transformations (XSLT) stylesheet to the XML. I was impressed with the ease of doing this.
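To make the XSLT step concrete, here is a minimal sketch using the standard JAXP API (`javax.xml.transform`) that ships with the JDK. It is not Offisor's own code: the class name `XsltSketch`, the sample element names (`document`, `title`, `para`), and the stylesheet are all hypothetical stand-ins, since the real Offisor schema is described in Davisor's reference.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class XsltSketch {
    // Hypothetical input: these element names are illustrative, not Offisor's schema.
    static final String SAMPLE_XML =
        "<document><title>Quarterly Report</title>"
      + "<para>Sales rose.</para><para>Costs fell.</para></document>";

    // Hypothetical stylesheet that maps the sample elements to plain HTML.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/document\">"
      + "<html><body><h1><xsl:value-of select=\"title\"/></h1>"
      + "<xsl:apply-templates select=\"para\"/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match=\"para\">"
      + "<p><xsl:value-of select=\".\"/></p>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the given stylesheet to the given XML and returns the result as a string.
    static String transform(String xml, String xslt) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SAMPLE_XML, SAMPLE_XSLT));
    }
}
```

Swapping in a different stylesheet is all it takes to target another output format, which is why an XML intermediate is attractive here.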

Of the HTML documents, the Offisor HTML document has the fewest lines: 187 (note that Offisor also added a couple of lines stating that I was using a demo version). In comparison, the Word-exported HTML was a whopping 579 lines and the Dreamweaver-parsed HTML was 264. These counts are after a simple cleanup to remove empty space. The Dreamweaver example did retain some of the styling, which is arguably nice because documents look familiar after conversion. However, the styling carries font and other information as explicit elements rather than as style classes, which could be a problem for content management systems that try to apply a single style site-wide. Overall, I felt that Offisor shone. Because Offisor documents can be parsed dynamically, developers can also use XSLT to transform the XML into whatever they like, whereas both the Save As method and Dreamweaver parsing require additional steps before content is uploaded. The XML output by Offisor is intuitive and straightforward to use. A Word XML reference can be found at the Davisor site.
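The review does not say how the lines were counted. One plausible way to get comparable numbers, assuming "cleanup" means ignoring blank lines, is sketched below; the class name `LineCountSketch` and its method are hypothetical. By the figures above, the Word export is roughly three times the size of the Offisor output (579 / 187), and the Dreamweaver version about 1.4 times (264 / 187).

```java
import java.util.Arrays;

class LineCountSketch {
    // Counts lines that contain something other than whitespace.
    // "\\R" matches any line break (\n, \r\n, or \r), so mixed files are handled.
    static long countNonBlankLines(String text) {
        return Arrays.stream(text.split("\\R"))
                     .filter(line -> !line.trim().isEmpty())
                     .count();
    }

    public static void main(String[] args) {
        // Line counts as reported in the review.
        int offisor = 187;
        int word = 579;
        int dreamweaver = 264;
        System.out.printf("Word vs. Offisor: %.1fx%n", (double) word / offisor);
        System.out.printf("Dreamweaver vs. Offisor: %.1fx%n", (double) dreamweaver / offisor);
    }
}
```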

Of course, I didn't compare the actual proprietary Word doc file. Why? Well, it's a binary format, so it really isn't worth comparing. If this weren't the case, there wouldn't be any need for tools like Offisor, now would there?

How Offisor can be used

Where Offisor gets exciting, I think, is in its potential applications. Converting documents from a popular and proprietary format, as well as from basic HTML, into a universal one is undoubtedly a move in the right direction. The possibilities are numerous:

  • E-mail Attachment Conversion - Dynamically converting Word attachments in e-mail brings several advantages: it eliminates virus threats, enables simple in-browser HTML viewing of the attachments, and lets users view them on computers and devices that don't have Word installed.
  • Content Management Delivery - Clearly the most obvious use: everyone in the office can simply upload their Word documents, as well as HTML and other documents, and have them translated into HTML, Wireless Markup Language (WML), or any other form of XML that can be read by a variety of devices and applications.
  • Universal Content Repository - Storing all HTML and Word docs in a universal format, especially XML, can eliminate numerous data storage issues. Because XML is text-based, as opposed to Word's binary format, documents can be stored in character format.
  • Superior Indexing - Again because XML is text-based rather than binary, documents can be indexed and users can perform keyword searches on the content.
  • Easy PDF Conversion - Documents can be converted to PDF easily, even on the fly. The examples even include a working version of this functionality. PDF's read-only nature makes it a superior format in some circumstances, such as for contracts or for users who are concerned about Word viruses.
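The indexing point in the list above can be illustrated with a short sketch. Once a document is XML, the JDK's standard DOM parser can pull out its text for a keyword search without matching on markup. The class `XmlSearchSketch`, its methods, and the sample element names are hypothetical and do not reflect Offisor's API or schema; a real system would feed the extracted text to a proper search index rather than call `contains`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class XmlSearchSketch {
    // Returns the concatenated text content of the document, without any tags.
    static String extractText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Case-insensitive keyword check against the document text only.
    static boolean containsKeyword(String xml, String keyword) {
        String text = extractText(xml).toLowerCase(Locale.ROOT);
        return text.contains(keyword.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical converted document.
        String xml = "<document><para>Signed Contract</para></document>";
        System.out.println(containsKeyword(xml, "contract"));
        System.out.println(containsKeyword(xml, "para"));
    }
}
```

Because the search runs over text content only, a keyword that happens to match an element name (such as `para` here) does not produce a false hit.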

Conclusion

Quite frankly, I couldn't be more impressed with this tool. It is simple and cheap (300 Euros, about US$350, for the tool only, or 600 Euros, about US$700, including one year of support and upgrades), and it does exactly what it says: it converts binary Word docs and loosely structured HTML docs into XML, and it does it well. The documentation is comprehensive and the API is simple. I look forward to being able to parse other Office documents, such as Excel and PowerPoint files. While these types of translations are probably accessible to developers working within the Microsoft development environment, in Java/J2EE this has always been a challenge. I think that Offisor can help us all deal with this pervasive issue.


Drew Falkman is the author of the JRun Web Application Construction Kit and co-author (with Ben Forta) of Reality ColdFusion: J2EE Integration, both published by Macromedia Press. Over the past 6 years, Drew has developed over 150 Web applications of all sizes using ColdFusion and Java. Currently Drew consults, speaks at events, writes for numerous publications, and teaches courses at Portland State University. His latest project through his consulting company, Veraison LLC, was a real-time cattle auction using Flash Remoting and Flash Communication Server. In addition, Drew is a member of Team Macromedia, a certified ColdFusion Developer, and a certified Macromedia instructor.
