
Semester 2 2021
Lecture 3, Part II: Data Formats – Semi-structured format: CSV, HTML, and XML

Semi-structured Data

CSV: comma separated values
• Tabular information, with extension .csv
• Structured, but not in the way an Excel workbook or a relational DB is
• Just a delimited text file; human readable
• Lacks formatting information
• Does not contain formulas or macros for data verification or transformation
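
The points above can be illustrated with a minimal sketch using Python's standard csv module. The sample data and column names are made up for illustration. Every value comes back as a plain string, and a cell that looks like a spreadsheet formula is just text.

```python
import csv
import io

# Hypothetical sample data: a header row plus two records.
raw = "name,price,total\nbook,10,=B2*2\nvideo,8.5,17\n"

# csv.DictReader treats the first row as the column names.
rows = list(csv.DictReader(io.StringIO(raw)))

for row in rows:
    print(row)

# No type or formatting information is stored: everything is a string.
first_price = rows[0]["price"]
first_total = rows[0]["total"]  # the "formula" is kept as literal text
```

Any conversion to numbers or dates has to be done by the program that reads the file.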

HTML: HyperText Markup Language
• Marked up with elements, which correspond to logical units such as a heading, a paragraph or an itemised list
• Defines how a web browser will format and display the content
• Elements are marked by tags
• Tags: keywords contained in pairs of angle brackets; not case sensitive
• Closed tags: an opening tag and a matching closing tag enclose the content
• Unclosed tags: a single tag with no matching closing tag, e.g. <br> for a line break
• Elements can have attributes; ordering of attributes is not significant.
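
As a rough illustration of tags and attributes, the sketch below uses Python's standard html.parser module. The class name TagCollector and the two snippets are hypothetical. It records each start tag with its attributes and checks that tag case and attribute order make no difference.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Records every start tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names in lower case.
        self.seen.append((tag, dict(attrs)))

# Two hypothetical snippets: different tag case and attribute order.
a = TagCollector()
a.feed('<A href="x.html" title="t">link</A>')
a.close()

b = TagCollector()
b.feed('<a title="t" href="x.html">link</a>')
b.close()

# Attributes are stored in a dict, so their order does not matter.
same = a.seen == b.seen
print(a.seen, same)
```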

HTML Example
Try it yourself: https://www.w3schools.com/html/tryit.asp?filename=tryhtml5_browsers_myhero
HTML examples: https://www.w3schools.com/html/html_lists.asp

Limitations of HTML
• HTML was designed for pure presentation
• HTML is concerned with formatting, not meaning
• it doesn't matter what the content is about, HTML will format it
• HTML is not extensible
• can’t be modified to meet specific domain knowledge
• browsers have developed their own tags (, )
• HTML can be inconsistently applied
• almost everything is rendered somehow
• e.g. is this acceptable?
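
To see how loosely HTML can be applied, here is a small sketch with Python's standard html.parser. It is a tokenizer rather than a browser, so it only approximates the point. The Events class and the sloppy snippet are made up for illustration. The markup has unclosed paragraphs and a mismatched closing tag, yet it is processed without any error.

```python
from html.parser import HTMLParser

class Events(HTMLParser):
    """Logs start and end tags in the order they are encountered."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

# Hypothetical sloppy markup: <p> is never closed, <b> is "closed" by </i>.
sloppy = "<p>first<p>second <b>bold</i>"

p = Events()
p.feed(sloppy)
p.close()
print(p.events)
```

Nothing here checks that tags are balanced, which is roughly why almost any input ends up rendered somehow.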

XML: eXtensible Markup Language
• Extensible: user defined tags
• Facilitates better encoding of semantics
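
A minimal sketch of user-defined tags, using Python's standard xml.etree.ElementTree. The vocabulary (book, title, price) and the attribute values are invented for illustration. Because the tags describe what the data means, a program can ask for the price directly.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain vocabulary: <book>, <title>, <price>.
book = ET.Element("book", attrib={"isbn": "000-0"})
ET.SubElement(book, "title").text = "Data Formats"
price = ET.SubElement(book, "price", attrib={"currency": "AUD"})
price.text = "10"

# Serialise the tree to XML text.
xml_text = ET.tostring(book, encoding="unicode")
print(xml_text)

# Parse it back and look fields up by their meaning, not their position.
parsed = ET.fromstring(xml_text)
amount = float(parsed.find("price").text)
currency = parsed.find("price").get("currency")
```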

XML syntax – cont.
• Preserves white space: I think … therefore I am
• Some characters have special meaning
• ‘<’ and ‘&’ are strictly illegal inside an element
• not well-formed: all books & videos are now < AUD 10
• escaped with entities: all books &amp; videos are now &lt; AUD 10
• A CDATA (character data) section may be used inside an XML element to include large blocks of text that contain special characters such as &, < and >
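
The rules above can be checked with a short sketch using Python's standard library. The element names offer and quote are assumptions chosen for this example. It shows that the raw text is rejected, and that both entity escaping and a CDATA section recover the original string.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

text = "all books & videos are now < AUD 10"

# Unescaped, the text makes the document not well-formed.
try:
    ET.fromstring(f"<offer>{text}</offer>")
    raw_ok = True
except ET.ParseError:
    raw_ok = False

# Option 1: replace & and < with the entities &amp; and &lt;.
escaped_doc = f"<offer>{escape(text)}</offer>"
from_escaped = ET.fromstring(escaped_doc).text

# Option 2: wrap the text in a CDATA section.
cdata_doc = f"<offer><![CDATA[{text}]]></offer>"
from_cdata = ET.fromstring(cdata_doc).text

# White space inside an element is preserved by the parser.
spaced = ET.fromstring("<quote>I think   therefore I am</quote>").text

print(raw_ok, escaped_doc, from_cdata == text)
```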

XML applications
• A ‘meta’ mark-up language.
• Mathematical Markup Language (MathML)
• ChemML (Chemical Markup Language)
• FHIR (Health/Medical data: http://hl7.org/fhir)
• RSS, SOAP, SVG, …

MathML example: markup an equation
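In MathML, x^3 + 6x + 6 is represented as

<mrow>
  <msup> <mi>x</mi> <mn>3</mn> </msup>
  <mo>+</mo>
  <mrow> <mn>6</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow>
  <mo>+</mo>
  <mn>6</mn>
</mrow>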
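
As a hedged sketch, the same expression can be assembled with Python's xml.etree.ElementTree using the MathML element names mrow, msup, mi, mn and mo. The helper leaf is hypothetical, and the MathML namespace is left out for brevity. The invisible-times operator is written as the character U+2062, which is what the &InvisibleTimes; entity stands for.

```python
import xml.etree.ElementTree as ET

def leaf(parent, tag, text):
    """Append a token element (mi, mn or mo) holding the given text."""
    el = ET.SubElement(parent, tag)
    el.text = text
    return el

INVISIBLE_TIMES = "\u2062"  # character behind the &InvisibleTimes; entity

expr = ET.Element("mrow")
power = ET.SubElement(expr, "msup")  # x^3: base first, then exponent
leaf(power, "mi", "x")
leaf(power, "mn", "3")
leaf(expr, "mo", "+")
term = ET.SubElement(expr, "mrow")   # 6x, joined by an invisible times
leaf(term, "mn", "6")
leaf(term, "mo", INVISIBLE_TIMES)
leaf(term, "mi", "x")
leaf(expr, "mo", "+")
leaf(expr, "mn", "6")

print(ET.tostring(expr, encoding="unicode"))

# Read the visible tokens back in document order.
tokens = [el.text for el in expr.iter()
          if el.text and el.text != INVISIBLE_TIMES]
```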

XML vs HTML
• XML is extensible; HTML is not
• XML is case sensitive; HTML is not
• XML focuses on semantics; HTML focuses on display
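
The case-sensitivity difference can be sketched with Python's standard parsers. The snippet and the class name Names are assumptions for illustration. The same mixed-case markup is rejected as XML but accepted as HTML, where tag names are reported in lower case.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

mixed_case = "<Note>remember</note>"

# XML: <Note> and </note> are different names, so this is not well-formed.
try:
    ET.fromstring(mixed_case)
    xml_accepts = True
except ET.ParseError:
    xml_accepts = False

class Names(HTMLParser):
    """Collects tag names as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        self.names.append(tag)

    def handle_endtag(self, tag):
        self.names.append("/" + tag)

# HTML: tag names are not case sensitive, so the pair matches.
h = Names()
h.feed(mixed_case)
h.close()
print(xml_accepts, h.names)
```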