R E P O R T
"Parsing XML with DOM"
Conception
Having already worked with XML several times, I thought it would be better
to include some of the real projects I did during my assistantship at TECFA.
The most important cases I present here are:
- Formation Continue: I had already written the DTD and the XML file is complete, but I parsed the file with Java & Xerces for the purposes of this exercise.
- Project staf18 (promotion Fanny): this is real work that I did entirely myself at TECFA. It will change and be improved by the end of June, as the project evolves.
Results
Introduction page:
I present the DTDs of Formation Continue and of project staf18.
For staf18 I actually wrote several DTDs that together form a complete one
and that can also be combined in different ways.
I made some small changes to the DTD of Formation Continue, such as adding
ids for the courses, plus a few other minor ones.
XML files:
For Formation Continue, I give here the original file that was
filled in by the users (with my help for some of them). For staf18, I use the XML
files from the students of staf-f.
Servlets:
For Formation Continue, I parse the XML file and build a table of contents
of all the existing courses and their modules.
For the staf18 project, I parse two files from each student's directory and
extract useful information, such as
file info, group info (names, URLs, etc.) and titles, which
helps the professor monitor the project.
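As an illustration only, the table of contents step could be sketched as below. This is not the actual servlet code: the element and attribute names (course, module, id, title) are assumptions made for the example, not the real Formation Continue DTD, and the parser is obtained through the standard JAXP interface rather than by calling Xerces classes directly.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ContentsTable {

    // Returns one line per course, followed by one indented line per module.
    // The names "course", "module", "id" and "title" are hypothetical.
    public static List<String> build(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        List<String> lines = new ArrayList<>();
        NodeList courses = doc.getElementsByTagName("course");
        for (int i = 0; i < courses.getLength(); i++) {
            Element course = (Element) courses.item(i);
            lines.add(course.getAttribute("id") + ": " + course.getAttribute("title"));

            // Only the modules nested inside this course element.
            NodeList modules = course.getElementsByTagName("module");
            for (int j = 0; j < modules.getLength(); j++) {
                Element module = (Element) modules.item(j);
                lines.add("  - " + module.getAttribute("title"));
            }
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<courses><course id=\"c1\" title=\"XML\">"
                   + "<module title=\"DTD\"/><module title=\"DOM\"/></course></courses>";
        for (String line : build(xml)) {
            System.out.println(line);
        }
    }
}
```

In a servlet, the returned lines would be written to the response as HTML instead of being printed.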
Technical details - problems
As far as the DTDs are concerned, I found the design quite easy.
Of course, there was always the precious guidance of DKS, who with his
experience pointed out the mistakes I was making, mostly concerning
the semantics.
To fill in the XML files, I used the xml-mode of Emacs, which works
quite well, although not perfectly :)
To parse the XML files I used the DOM parser of Xerces. I know that
DOM is generally slower than SAX, since it builds the whole document tree
in memory, but not having much time to invest in this, I
decided to do it that way, which seemed easier to me for the time being, and
to leave SAX for the near future. Of course, to "catch" parsing errors
I used the SAX exceptions, which are more precise in most cases.
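The combination described above (DOM for parsing, SAX exceptions for error reporting) might look roughly like the following sketch. It is an assumption about how such a check can be written with the standard JAXP classes, not the code used for the project. The class name ValidityCheck, the method check and the tiny inline DTD are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class ValidityCheck {

    // Returns null when the document is valid, otherwise a message
    // containing the line number reported by the parser.
    public static String check(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // validate against the DTD declared in the document
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Turn every reported problem into an exception so that parse() stops.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                // warnings are ignored in this sketch
            }
            public void error(SAXParseException e) throws SAXParseException {
                throw e; // validity error
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e; // document is not well-formed
            }
        });

        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return null;
        } catch (SAXParseException e) {
            return "line " + e.getLineNumber() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical DTD: an info element must contain exactly one title.
        String dtd = "<!DOCTYPE info [<!ELEMENT info (title)><!ELEMENT title (#PCDATA)>]>";
        System.out.println(check(dtd + "<info><title>staf18</title></info>"));
        System.out.println(check(dtd + "<info/>"));
    }
}
```

The SAXParseException carries the line and column where the problem was found, which is what makes the error message precise enough to send back to the author of the file.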
The things that I am proud of (or ashamed of :)
- DTD inclusions:
I have split the DTD into several smaller ones, to manage the project by phases,
so each DTD is made up of several included DTDs.
- validation:
by catching the SAX exceptions for the staf18 project,
I can notify the students that their file is not valid.
- educative tools:
both projects are being used for courses at TECFA.
- DOM parsing:
it took me some time to understand how it works, though,
and I crashed the TECFA2 server several times :)
- going through many directories:
In order to parse the info.xml & specification.xml files from each
student directory, I used the File class and its methods.
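The directory traversal mentioned in the last item can be sketched as follows. This is a minimal example under assumptions: it supposes one sub-directory per student directly under a common root directory, which may not match the real layout on the server, and the class name StudentFiles is invented. Only the file names info.xml and specification.xml come from the project.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StudentFiles {

    private static final String[] NAMES = {"info.xml", "specification.xml"};

    // Collects the info.xml and specification.xml files found in each
    // direct sub-directory of root (assumed: one sub-directory per student).
    public static List<File> collect(File root) {
        List<File> found = new ArrayList<>();
        File[] dirs = root.listFiles(File::isDirectory);
        if (dirs == null) {
            return found; // root does not exist or is not a directory
        }
        Arrays.sort(dirs); // listFiles() guarantees no particular order
        for (File dir : dirs) {
            for (String name : NAMES) {
                File candidate = new File(dir, name);
                if (candidate.isFile()) {
                    found.add(candidate);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        File root = new File(args.length > 0 ? args[0] : ".");
        for (File f : collect(root)) {
            System.out.println(f.getPath());
        }
    }
}
```

Each file returned here can then be handed to the DOM parser. A missing file is simply skipped, so a student who has not yet written specification.xml does not stop the whole run.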