XML Tutorial for Beginners: Learn XML Step by Step

By Softlookup Editorial Team · Updated April 25, 2026 · 12 min read · Free 13-chapter course

New to XML? Good news — it's one of the easiest technologies to learn. If you can read HTML, you can read XML in 5 minutes. This free 13-chapter tutorial takes you from "what's XML" to writing schemas (DTD and XSD), with a working XML validator on this page so you can practice without installing anything.
13 free chapters · ~8 hrs total study time · XML 1.0 released in 1998 · $0 cost (built into every OS)

What Is XML?

XML (eXtensible Markup Language) is a text format for storing and exchanging structured data. Unlike HTML — which has fixed tags like <p>, <div>, <h1> for displaying web pages — XML lets you invent your own tags to describe what your data means:

<customer>
  <name>Alice Chen</name>
  <email>alice@example.com</email>
  <city>Paris</city>
</customer>

That's a complete XML document. Three rules and you've already understood the format:

  1. Every opening tag <name> needs a closing tag </name>
  2. Tags are nested but never overlap
  3. The whole document has one root element wrapping everything

Where XML Is Still Used in 2026

JSON has replaced XML in most modern web APIs, but XML is far from dead. Anywhere you find one of these, you'll find XML:

| Use Case | What You'll See |
| --- | --- |
| Configuration files | Spring (Java), Maven pom.xml, Android AndroidManifest.xml, .NET app configs |
| Office documents | Microsoft Word .docx, Excel .xlsx, PowerPoint .pptx: all are ZIP files with XML inside |
| Web feeds | RSS, Atom: every podcast feed and most blog feeds are XML |
| Sitemaps | Every sitemap.xml on every website you've ever submitted to Google |
| SVG graphics | Scalable Vector Graphics, used everywhere from icons to charts, is XML |
| SOAP web services | Banking, healthcare, government, and B2B integrations |
| Government & enterprise data | Tax filings, healthcare records (HL7), legal documents (LegalXML) |
| Digital publishing | EPUB ebooks are ZIPs of XML; print publishing uses DocBook, DITA |

Try It Yourself: XML Validator

Below is a working XML validator that runs entirely in your browser. Paste any XML to check if it's well-formed (follows basic syntax rules). Try the example, or break it on purpose to see how errors are reported.

🛠️ XML Well-Formedness Validator
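If you'd rather run the same kind of check from a script, it can be sketched with Python's standard library. This is a minimal sketch and not the validator on this page: the function name `check_well_formed` is made up for illustration, and it only tests syntax, not conformance to a schema.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The parser's message includes the line and column of the failure.
        return False, str(err)
    return True, None


good = "<customer><name>Alice Chen</name><city>Paris</city></customer>"
broken = "<customer><name>Alice Chen</customer>"  # <name> is never closed

print(check_well_formed(good))
print(check_well_formed(broken))
```

The first call reports success; the second returns the parser's error message, which points at the position where the mismatch was detected.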

The 5 Core Rules of XML

Rule 1: Every Opening Tag Needs a Closing Tag

<!-- Wrong: -->
<name>Alice

<!-- Right: -->
<name>Alice</name>

<!-- Empty element shortcut: -->
<br/>

Rule 2: Tags Must Be Properly Nested

<!-- Wrong: -->
<b><i>Bold and italic</b></i>

<!-- Right: -->
<b><i>Bold and italic</i></b>

Rule 3: One Root Element

Every XML document has exactly one outermost element that wraps all others:

<?xml version="1.0" encoding="UTF-8"?>
<catalog>             <!-- root element -->
  <product>...</product>
  <product>...</product>
</catalog>

Rule 4: Attribute Values Must Be Quoted

<!-- Wrong: -->
<book category=programming>

<!-- Right (single or double quotes both work): -->
<book category="programming">
<book category='programming'>

Rule 5: XML Is Case-Sensitive

<!-- These are DIFFERENT elements: -->
<Name>Alice</Name>
<name>Alice</name>
<NAME>Alice</NAME>

<!-- And THIS is a syntax error (mismatched case): -->
<Name>Alice</name>
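The five rules can also be checked mechanically. The sketch below feeds the wrong and right snippets from this section to Python's built-in parser; `is_well_formed` is a hypothetical helper name, and the two-root snippet is a shortened stand-in for the catalog example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if Python's built-in XML parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# (rule, snippet, expected result) based on the examples above
cases = [
    ("Rule 1: unclosed tag", "<name>Alice", False),
    ("Rule 1: closed tag", "<name>Alice</name>", True),
    ("Rule 2: overlapping tags", "<b><i>Bold and italic</b></i>", False),
    ("Rule 2: nested tags", "<b><i>Bold and italic</i></b>", True),
    ("Rule 3: two roots", "<product/><product/>", False),
    ("Rule 4: unquoted attribute", "<book category=programming/>", False),
    ("Rule 4: quoted attribute", "<book category='programming'/>", True),
    ("Rule 5: mismatched case", "<Name>Alice</name>", False),
]

for rule, snippet, expected in cases:
    verdict = "well-formed" if is_well_formed(snippet) else "NOT well-formed"
    print(f"{rule}: {verdict}")
```

Every snippet marked "Wrong" above is rejected by the parser, which is what makes well-formedness easy to test automatically.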

XML vs HTML: The Key Differences

| Feature | HTML | XML |
| --- | --- | --- |
| Purpose | Display web pages | Store and exchange data |
| Tags | Predefined (<p>, <div>, etc.) | You invent them |
| Case sensitivity | Tolerates either | Strict: case matters |
| Closing tags | Some optional | Always required |
| Attribute quotes | Recommended | Required |
| Browser display | Renders visually | Shows tree structure |

XML vs JSON: When to Use Which

The honest answer: it depends on the system you're working with.

| Use Case | Better Choice |
| --- | --- |
| Modern REST API | JSON |
| JavaScript/web frontend data | JSON |
| SOAP web service | XML (required) |
| Configuration files (Spring, Maven, Android) | XML (often required) |
| RSS/Atom feeds | XML (standardized) |
| SVG graphics | XML (it IS XML) |
| Document with mixed text and markup | XML |
| Pure data, no document structure | JSON |
| Document validation against a schema | XML (better tooling) |
| Government/healthcare/banking integrations | XML (often mandated) |

Quick conversion example. The same data in both formats:

<!-- XML -->
<customer>
  <name>Alice</name>
  <age>30</age>
</customer>

// JSON
{
  "customer": {
    "name": "Alice",
    "age": 30
  }
}
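The conversion above can be sketched in a few lines of Python using only the standard library. This is one possible mapping, not a standard one: XML has no built-in notion of numbers, so the sketch assumes you know that age should become an integer.

```python
import json
import xml.etree.ElementTree as ET

xml_text = "<customer><name>Alice</name><age>30</age></customer>"

root = ET.fromstring(xml_text)

# XML text content is always a string, so the number has to be
# converted explicitly to get a JSON number rather than "30".
customer = {
    root.tag: {
        "name": root.findtext("name"),
        "age": int(root.findtext("age")),
    }
}

print(json.dumps(customer, indent=2))
```

The explicit `int(...)` call is the interesting part: JSON carries basic types in the syntax, while XML relies on a schema or on the consuming code to say what is a number.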

Attributes vs Elements: When to Use Which

You can model the same data two ways:

<!-- Using attributes: -->
<book title="Learning XML" author="Erik Ray" year="2003"/>

<!-- Using child elements: -->
<book>
  <title>Learning XML</title>
  <author>Erik Ray</author>
  <year>2003</year>
</book>

Use attributes for metadata about an element (id, type, lang, version) and for simple values that won't have substructure.

Use child elements for the main content, anything that might contain other elements, or anything that might repeat.
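The choice also affects how code reads the data. Here is a small sketch using Python's standard library; the second author is a placeholder added only to show repetition and is not part of the original example.

```python
import xml.etree.ElementTree as ET

as_attributes = ET.fromstring(
    '<book title="Learning XML" author="Erik Ray" year="2003"/>'
)
as_elements = ET.fromstring(
    "<book><title>Learning XML</title>"
    "<author>Erik Ray</author><year>2003</year></book>"
)

# Attributes are read with .get(); child element text with .findtext().
title_from_attribute = as_attributes.get("title")
title_from_element = as_elements.findtext("title")
print(title_from_attribute, "|", title_from_element)

# Child elements can repeat; an attribute name can appear only once.
# "Second Author" is a hypothetical placeholder value.
two_authors = ET.fromstring(
    "<book><author>Erik Ray</author><author>Second Author</author></book>"
)
authors = [author.text for author in two_authors.findall("author")]
print(authors)
```

Both forms carry the same title, but only the element form can grow to hold several authors without changing the structure.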

Well-Formed vs Valid XML

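Two different concepts that often confuse beginners:

Well-formed XML follows the basic syntax rules (the five core rules above). Any XML parser can check this without knowing anything about your data.

Valid XML is well-formed and also conforms to a schema (a DTD or an XSD). All valid XML is well-formed; not all well-formed XML is valid.

Example: The XML below is well-formed, but if your schema requires every <book> to have a <price>, it is not valid: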

<book>
  <title>Learning XML</title>
  <!-- missing <price> -->
</book>
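The difference can be illustrated in code. Python's standard library parses XML but does not validate it against a DTD or XSD, so the sketch below uses a hand-written list of required children as a stand-in for a real schema. The names `REQUIRED_CHILDREN` and `missing_children` are hypothetical; real validation would use a validating parser or library.

```python
import xml.etree.ElementTree as ET

# Stand-in for a real schema: every <book> must have these children.
REQUIRED_CHILDREN = ["title", "price"]


def missing_children(xml_text, required=REQUIRED_CHILDREN):
    """Parse xml_text (raises ParseError if it is not well-formed) and
    return the required child element names that are absent."""
    root = ET.fromstring(xml_text)
    return [name for name in required if root.find(name) is None]


book = "<book><title>Learning XML</title><!-- missing <price> --></book>"

# Parsing succeeds, so the document is well-formed...
# ...but the required <price> child is missing, so it is not "valid".
print(missing_children(book))  # ['price']
```

An empty list would mean the document passes this toy rule. A real schema also checks element order, data types, and much more.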

DTD vs XSD: Two Schema Languages

Two ways to define the rules an XML document must follow:

DTD (Document Type Definition) — older, simpler

<!ELEMENT book (title, author, year, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT price (#PCDATA)>

XSD (XML Schema Definition) — modern, more powerful

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="year" type="xs:integer"/>
        <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Use XSD for new projects — it has data types (string, integer, date), namespaces, and more validation power.

You'll see DTD in older systems and document formats (DocBook, some HTML doctypes).

Common Beginner Mistakes

1. Mismatched Case

<Customer>...</customer> is broken — XML is case-sensitive. Open and close must match exactly.

2. Forgetting the XML Declaration

Recommended at the top of every file:

<?xml version="1.0" encoding="UTF-8"?>

It's optional but helps tools handle character encoding correctly.

3. Special Characters Not Escaped

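XML predefines escape sequences for five characters. < and & must always be escaped in element content and attribute values; a quote character must be escaped when it appears inside an attribute value delimited by the same kind of quote; > is usually escaped as well for symmetry:

| Character | Escape Sequence |
| --- | --- |
| < | &lt; |
| > | &gt; |
| & | &amp; |
| " | &quot; |
| ' | &apos; |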
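In practice you rarely escape by hand. As a sketch, Python's standard library provides `escape` and `unescape` in `xml.sax.saxutils`; by default they handle &, <, and >, so the two quote characters are passed in as extra entities here. The sample string is invented for illustration.

```python
from xml.sax.saxutils import escape, unescape

# escape() handles &, <, and > by default; the quote characters are
# supplied explicitly as extra entities.
QUOTES = {'"': "&quot;", "'": "&apos;"}

raw = 'Tom & Jerry say "1 < 2"'
escaped = escape(raw, QUOTES)
print(escaped)  # Tom &amp; Jerry say &quot;1 &lt; 2&quot;

# unescape() takes the reverse mapping for the extra entities.
restored = unescape(escaped, {"&quot;": '"', "&apos;": "'"})
print(restored == raw)  # True
```

Libraries that build XML for you, such as `xml.etree.ElementTree`, apply this escaping automatically when they serialize a document.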

4. Multiple Root Elements

<!-- Broken — two root elements: -->
<customer>Alice</customer>
<customer>Bob</customer>

<!-- Fixed — wrap in a root: -->
<customers>
  <customer>Alice</customer>
  <customer>Bob</customer>
</customers>

5. Tag Names Starting with a Number

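<1stcustomer> is not well-formed. XML element names must start with a letter or underscore, never a digit, hyphen, or period. Digits are fine after the first character, as in <customer1>, and names cannot contain spaces.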
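A rough pre-check for element names can be sketched with a regular expression. This is a simplified, ASCII-only approximation: the XML specification also allows a colon and many non-ASCII letters, so treat `looks_like_valid_name` (a hypothetical helper) as a teaching aid rather than a full implementation.

```python
import re

# Simplified, ASCII-only approximation of the XML name rule:
# the first character is a letter or underscore; later characters
# may also be digits, hyphens, or periods.
SIMPLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def looks_like_valid_name(name):
    """True if name matches the simplified pattern above."""
    return SIMPLE_NAME.fullmatch(name) is not None


for candidate in ["customer", "_id", "customer1", "1stcustomer", "first name"]:
    print(candidate, looks_like_valid_name(candidate))
```

The last two candidates fail: one starts with a digit and the other contains a space.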

Complete Learning Path

Work through the 13 chapters below in order. Each chapter is short — about 30 minutes of reading and practice.

Start Chapter 1: What is XML? →

XML Quick Reference Cheat Sheet

| Pattern | Syntax |
| --- | --- |
| XML declaration | <?xml version="1.0" encoding="UTF-8"?> |
| Element with content | <name>Alice</name> |
| Empty element | <br/> |
| Element with attribute | <book id="123">...</book> |
| Comment | <!-- This is a comment --> |
| CDATA section (raw text) | <![CDATA[Anything < here is OK]]> |
| Namespace | <ns:book xmlns:ns="http://..."> |
| Reference to DTD | <!DOCTYPE root SYSTEM "schema.dtd"> |
| XSD root element | <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> |
| Escape less-than | &lt; |
| Escape ampersand | &amp; |
| Self-closing tag | <br/> or <br /> |

Frequently Asked Questions

What is XML used for in 2026?

XML is still widely used for configuration files (Spring, Maven, Android manifests), document formats (Microsoft Office .docx, .xlsx are XML internally), SOAP web services, RSS/Atom feeds, SVG graphics, sitemaps, and cross-system data exchange. JSON has replaced XML in many web APIs, but XML remains dominant in enterprise systems.

Is XML hard to learn?

XML is one of the easiest technologies to learn. The core syntax is just opening tags, closing tags, and attributes; anyone who has seen HTML can read XML in minutes. Schemas (DTD, XSD), namespaces, and XSLT add complexity later, but the basics take an hour or two.

What's the difference between XML and HTML?

HTML is a fixed set of predefined tags for displaying web pages. XML lets you define your own tags for storing structured data. HTML focuses on how things look; XML focuses on what data means.

Should I learn XML or JSON?

Learn both. JSON is preferred for modern web APIs and JavaScript work. XML is essential for configuration files, document formats, SOAP services, and any work touching enterprise systems.

How long does it take to learn XML?

Basic XML syntax: 1-2 hours. Reading and writing XML confidently: a few days. Understanding DTD and XSD schemas: 1-2 weeks. Mastering XSLT, XPath, and namespaces: a few months.

Do I need software to write XML?

No. XML is plain text, so Notepad on Windows or TextEdit on Mac (switched to plain-text mode) is enough. For production work, VS Code, Notepad++, or Sublime Text with XML support helps with syntax highlighting and auto-closing tags.

What is well-formed vs valid XML?

Well-formed XML follows basic syntax rules. Valid XML is well-formed AND conforms to a schema (DTD or XSD). All valid XML is well-formed; not all well-formed XML is valid.

Is XML still relevant in modern web development?

Yes, in specific contexts. APIs increasingly use JSON, but XML is still required for SOAP services, RSS feeds, sitemaps, SVG, Office formats, Android manifests, Spring config, and Maven build files.

Last updated: April 25, 2026.