Converting XML to JSON is one of the most common data transformation tasks in modern software development. Whether you are processing configuration files, RSS feeds, or SOAP API responses, an efficient XML-to-JSON converter saves hours of manual work. This guide covers everything from parsing mechanics to code examples in JavaScript, Python, Java, and Bash.
What is XML to JSON conversion?
XML and JSON are the two most widely used data interchange formats. XML dominated enterprise computing for decades. JSON emerged as the lightweight alternative favored by web developers.
An XML-to-JSON converter takes a well-formed XML document and transforms it into an equivalent JSON representation, handling elements, attributes, namespaces, and CDATA sections.
The reverse operation, JSON to XML conversion, is equally important when integrating with existing XML systems.
XML vs JSON: detailed comparison
Understanding the structural differences between XML and JSON is essential:
| Feature | XML | JSON |
|---|---|---|
| Syntax | Opening/closing tags | Key-value pairs |
| Attributes | Native support | No concept of attributes |
| Namespaces | Full support via xmlns | No native support |
| Size | Larger | Often smaller (30-50% is a commonly cited range; it depends on the data) |
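The size difference in the table can be checked on your own data. The sketch below is only an illustration: the `record` dictionary is made-up sample data, and the ratio it prints applies to that record alone, not to XML and JSON in general.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample record, used only to compare serialized sizes.
record = {
    "id": "1",
    "title": "The Great Gatsby",
    "author": "F. Scott Fitzgerald",
    "year": "1925",
}

# XML form: one child element per field.
book = ET.Element("book")
for key, value in record.items():
    ET.SubElement(book, key).text = value
xml_text = ET.tostring(book, encoding="unicode")

# JSON form: the same fields, serialized without extra whitespace.
json_text = json.dumps({"book": record}, separators=(",", ":"))

xml_size = len(xml_text.encode("utf-8"))
json_size = len(json_text.encode("utf-8"))
saving = 1 - json_size / xml_size
print(f"XML: {xml_size} bytes, JSON: {json_size} bytes, saving: {saving:.0%}")
```

Short tag names narrow the gap and long ones widen it, so it is worth measuring a representative payload before relying on any fixed percentage.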
How XML to JSON conversion works
The conversion process involves several key steps:
- Parse the XML document: build a DOM tree or emit SAX events.
- Map elements to objects: each element becomes a JSON property.
- Handle attributes: use prefixes such as @.
- Process text nodes: map them directly to string values.
- Convert repeated elements into arrays.
- Handle namespaces.
Several conventions define the XML to JSON mapping:
- BadgerFish convention: preserves all XML information.
- Parker convention: produces more compact JSON.
- GData convention: a middle ground used by Google.
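To make the difference between conventions concrete, here is a minimal sketch of two of them using only the Python standard library. The function names `to_badgerfish` and `to_parker` are illustrative, and both functions are simplified: they ignore namespaces and mixed content, and the Parker-style version keeps repeated children under their element name.

```python
import xml.etree.ElementTree as ET

def add_child(node, tag, value):
    """Store a child value, switching to a list when the tag repeats."""
    if tag in node:
        if not isinstance(node[tag], list):
            node[tag] = [node[tag]]
        node[tag].append(value)
    else:
        node[tag] = value

def to_badgerfish(el):
    """BadgerFish style: attributes become '@name' keys, text goes under '$'."""
    node = {f"@{name}": value for name, value in el.attrib.items()}
    text = (el.text or "").strip()
    if text:
        node["$"] = text
    for child in el:
        add_child(node, child.tag, to_badgerfish(child))
    return node

def to_parker(el):
    """Parker style (simplified): attributes are dropped, leaves become strings."""
    children = list(el)
    if not children:
        return (el.text or "").strip() or None
    node = {}
    for child in children:
        add_child(node, child.tag, to_parker(child))
    return node

root = ET.fromstring(
    '<book id="1"><title lang="en">Sapiens</title><year>2011</year></book>'
)
badgerfish = {root.tag: to_badgerfish(root)}
parker = to_parker(root)  # Parker also drops the root element name
print(badgerfish)
print(parker)
```

The BadgerFish result keeps `@id` and `@lang`, so it can be turned back into the original XML. The Parker result is shorter but cannot be converted back without loss.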
XML to JSON code examples
JavaScript: XML to JSON
In JavaScript, fast-xml-parser and xml2js offer robust conversion:
// ===== Using fast-xml-parser (recommended) =====
// npm install fast-xml-parser
import { XMLParser, XMLBuilder } from 'fast-xml-parser';
const xml = `<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book id="1" category="fiction">
<title lang="en">The Great Gatsby</title>
<author>F. Scott Fitzgerald</author>
<price currency="USD">10.99</price>
<year>1925</year>
</book>
<book id="2" category="non-fiction">
<title lang="en">Sapiens</title>
<author>Yuval Noah Harari</author>
<price currency="USD">14.99</price>
<year>2011</year>
</book>
</bookstore>`;
// Configure parser with attribute handling
const parser = new XMLParser({
  ignoreAttributes: false, // preserve attributes
  attributeNamePrefix: '@_', // prefix for attributes
  textNodeName: '#text', // key for text content
  isArray: (name, jpath) => { // force arrays for known collections
    return ['bookstore.book'].includes(jpath);
  },
});
const json = parser.parse(xml);
console.log(JSON.stringify(json, null, 2));
// Output:
// {
// "bookstore": {
// "book": [
// {
// "@_id": "1",
// "@_category": "fiction",
// "title": { "@_lang": "en", "#text": "The Great Gatsby" },
// "author": "F. Scott Fitzgerald",
// "price": { "@_currency": "USD", "#text": 10.99 },
// "year": 1925
// },
// ...
// ]
// }
// }
// ===== Using DOMParser (browser built-in) =====
function xmlToJson(xmlString) {
  const parser = new DOMParser();
  const doc = parser.parseFromString(xmlString, 'text/xml');
  function nodeToJson(node) {
    const obj = {};
    // Handle attributes
    if (node.attributes && node.attributes.length > 0) {
      for (let i = 0; i < node.attributes.length; i++) {
        const attr = node.attributes[i];
        obj['@' + attr.nodeName] = attr.nodeValue;
      }
    }
    // Handle child nodes
    if (node.childNodes && node.childNodes.length > 0) {
      for (let i = 0; i < node.childNodes.length; i++) {
        const child = node.childNodes[i];
        if (child.nodeType === 1) { // Element node
          const childObj = nodeToJson(child);
          if (obj[child.nodeName]) {
            // Convert to array if duplicate element names
            if (!Array.isArray(obj[child.nodeName])) {
              obj[child.nodeName] = [obj[child.nodeName]];
            }
            obj[child.nodeName].push(childObj);
          } else {
            obj[child.nodeName] = childObj;
          }
        } else if (child.nodeType === 3) { // Text node
          const text = child.nodeValue.trim();
          if (text) {
            // Return a bare string only for text-only elements without attributes;
            // returning early otherwise would drop any child elements that follow
            if (Object.keys(obj).length === 0 && node.children.length === 0) return text;
            obj['#text'] = text;
          }
        } else if (child.nodeType === 4) { // CDATA section
          obj['#cdata'] = child.nodeValue;
        }
      }
    }
    return obj;
  }
  const root = doc.documentElement;
  const result = {};
  result[root.nodeName] = nodeToJson(root);
  return result;
}
// ===== Using xml2js (Node.js) =====
// npm install xml2js
import { parseString } from 'xml2js';
parseString(xml, {
explicitArray: false,
mergeAttrs: true,
trim: true,
}, (err, result) => {
if (err) throw err;
console.log(JSON.stringify(result, null, 2));
});
// ===== Streaming large XML with sax (Node.js) =====
// npm install sax
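import fs from 'node:fs';
import sax from 'sax';
const saxParser = sax.createStream(true, { trim: true });
const stack = [];
let current = {};
saxParser.on('opentag', (node) => {
  const obj = {};
  if (node.attributes) {
    for (const [key, value] of Object.entries(node.attributes)) {
      obj['@' + key] = value;
    }
  }
  stack.push(current);
  // Note: a repeated sibling name overwrites the previous entry here
  current[node.name] = obj;
  current = obj;
});
saxParser.on('text', (text) => {
  if (text.trim()) current['#text'] = text.trim();
});
saxParser.on('closetag', () => {
  current = stack.pop();
});
saxParser.on('end', () => {
  console.log(JSON.stringify(current, null, 2));
});
// 'large.xml' is a placeholder path; pipe any readable stream into the parser
fs.createReadStream('large.xml').pipe(saxParser);
Python: XML to JSON
In Python, xmltodict is the most popular option: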
# ===== Using xmltodict (most popular) =====
# pip install xmltodict
import xmltodict
import json
xml_string = """<?xml version="1.0" encoding="UTF-8"?>
<catalog>
<product id="101" category="electronics">
<name>Wireless Mouse</name>
<price currency="USD">29.99</price>
<specs>
<weight unit="g">85</weight>
<battery>AA</battery>
<connectivity>Bluetooth 5.0</connectivity>
</specs>
<tags>
<tag>wireless</tag>
<tag>mouse</tag>
<tag>bluetooth</tag>
</tags>
</product>
</catalog>
"""
# Convert XML to Python dict (then to JSON)
data = xmltodict.parse(xml_string)
json_output = json.dumps(data, indent=2, ensure_ascii=False)
print(json_output)
# Output:
# {
# "catalog": {
# "product": {
# "@id": "101",
# "@category": "electronics",
# "name": "Wireless Mouse",
# "price": { "@currency": "USD", "#text": "29.99" },
# "specs": {
# "weight": { "@unit": "g", "#text": "85" },
# "battery": "AA",
# "connectivity": "Bluetooth 5.0"
# },
# "tags": { "tag": ["wireless", "mouse", "bluetooth"] }
# }
# }
# }
# Force specific elements to always be lists
data = xmltodict.parse(xml_string, force_list=('product', 'tag'))
# ===== Using defusedxml for security =====
# pip install defusedxml
import defusedxml.ElementTree as ET
# Safe parsing - blocks XXE, entity expansion, etc.
tree = ET.fromstring(xml_string)
def element_to_dict(element):
    result = {}
    # Handle attributes
    if element.attrib:
        for key, value in element.attrib.items():
            result[f'@{key}'] = value
    # Handle child elements
    children = list(element)
    if children:
        for child in children:
            child_data = element_to_dict(child)
            if child.tag in result:
                # Convert to list for repeated elements
                if not isinstance(result[child.tag], list):
                    result[child.tag] = [result[child.tag]]
                result[child.tag].append(child_data)
            else:
                result[child.tag] = child_data
    elif element.text and element.text.strip():
        if result:  # Has attributes
            result['#text'] = element.text.strip()
        else:
            return element.text.strip()
    return result
root = tree
json_data = {root.tag: element_to_dict(root)}
print(json.dumps(json_data, indent=2))
# ===== Using lxml with XPath =====
# pip install lxml
from lxml import etree
tree = etree.fromstring(xml_string.encode())
# Extract specific data with XPath, output as JSON
products = []
for product in tree.xpath('//product'):
    products.append({
        'id': product.get('id'),
        'name': product.xpath('name/text()')[0],
        'price': float(product.xpath('price/text()')[0]),
        'currency': product.xpath('price/@currency')[0],
    })
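print(json.dumps(products, indent=2))
Bash / CLI: XML to JSON
xq (part of yq) and xmlstarlet enable conversion from the command line:
# ===== Using xq (part of yq, recommended) =====
# Install: pip install yq (the Python yq, which ships xq and needs jq installed)
# On Homebrew the Python version is python-yq; "brew install yq" installs a different tool without xq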
# Basic XML to JSON conversion
cat data.xml | xq .
# Or directly from a file
xq . data.xml
# Select specific fields (assumes several <product> elements, so .product is an array)
xq '.catalog.product[] | {name: .name, price: .price}' data.xml
# Convert and save to file
xq . input.xml > output.json
# Extract a specific value (assumes a single <product>, so .product is an object)
xq -r '.catalog.product.name' data.xml
# ===== Using xmlstarlet =====
# Install: brew install xmlstarlet OR apt install xmlstarlet
# Select specific elements
xmlstarlet sel -t -v "//product/name" data.xml
# Convert to a flat key-value format
xmlstarlet sel -t \
-m "//product" \
-v "@id" -o "," \
-v "name" -o "," \
-v "price" -n data.xml
# ===== Python one-liners =====
# Quick XML to JSON from command line
python3 -c "
import xmltodict, json, sys
print(json.dumps(xmltodict.parse(sys.stdin.read()), indent=2))
" < data.xml
# Using built-in xml.etree (no pip install needed)
python3 -c "
import xml.etree.ElementTree as ET, json, sys
root = ET.parse(sys.stdin).getroot()
def to_dict(el):
    d = dict(el.attrib)
    children = list(el)
    if children:
        for c in children:
            cd = to_dict(c)
            if c.tag in d:
                if not isinstance(d[c.tag], list): d[c.tag] = [d[c.tag]]
                d[c.tag].append(cd)
            else: d[c.tag] = cd
    elif el.text and el.text.strip():
        if d: d['#text'] = el.text.strip()
        else: return el.text.strip()
    return d
print(json.dumps({root.tag: to_dict(root)}, indent=2))
" < data.xml
# ===== Using curl + xq for API responses =====
# Fetch XML API and convert to JSON
curl -s "https://api.example.com/data.xml" | xq .
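# SOAP response to JSON (keys keep their namespace prefix; "soap:" is an
# assumption here, so adjust it to the prefix the service actually returns)
curl -s -X POST "https://api.example.com/soap" \
  -H "Content-Type: text/xml" \
  -d @request.xml | xq '."soap:Envelope"."soap:Body"'
Java: XML to JSON
In Java, Jackson XML and org.json are the main tools: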
// ===== Using org.json (simple conversion) =====
// Maven: org.json:json:20231013
import org.json.JSONObject;
import org.json.XML;
public class XmlToJsonExample {
    public static void main(String[] args) {
        String xml = """
            <bookstore>
              <book id="1">
                <title>Clean Code</title>
                <author>Robert C. Martin</author>
                <price>32.99</price>
              </book>
            </bookstore>
            """;
        // Simple one-line conversion
        JSONObject json = XML.toJSONObject(xml);
        System.out.println(json.toString(2));
        // With configuration
        JSONObject jsonKeepStrings = XML.toJSONObject(xml, true);
        // true = keep all values as strings (no type coercion)
    }
}
// ===== Using Jackson XML (more control) =====
// Maven: com.fasterxml.jackson.dataformat:jackson-dataformat-xml
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.dataformat.xml.XmlMapper;
public class JacksonXmlExample {
    public static void main(String[] args) throws Exception {
        String xml = "<book id=\"1\"><title>Clean Code</title></book>";
        XmlMapper xmlMapper = new XmlMapper();
        JsonNode node = xmlMapper.readTree(xml.getBytes());
        ObjectMapper jsonMapper = new ObjectMapper();
        String json = jsonMapper
            .writerWithDefaultPrettyPrinter()
            .writeValueAsString(node);
        System.out.println(json);
    }
}
// ===== Secure XML parsing in Java =====
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.XMLConstants;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Prevent XXE attacks
dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
dbf.setFeature(
"http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature(
"http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature(
"http://xml.org/sax/features/external-parameter-entities", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
JSON to XML conversion
The reverse direction presents its own challenges:
Key challenges: JSON arrays have no direct XML equivalent. Null values need a representation. Property names may be invalid as XML element names.
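The three challenges can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not a complete converter: `xml_name`, `build`, and `dict_to_xml` are hypothetical helper names, the name-sanitizing rule is a simplified ASCII-only approximation of the XML naming rules, and representing null as an empty element is just one possible choice.

```python
import re
import xml.etree.ElementTree as ET

def xml_name(key):
    """Turn a JSON key into a usable element name (simplified ASCII rule)."""
    name = re.sub(r"[^A-Za-z0-9_.-]", "_", str(key))
    if not re.match(r"[A-Za-z_]", name):
        name = "_" + name  # names may not start with a digit, '.', or '-'
    return name

def build(parent, key, value):
    """Append the XML form of one key/value pair under parent."""
    if isinstance(value, list):
        # Arrays: repeat the element once per item (an empty list emits nothing).
        for item in value:
            build(parent, key, item)
        return
    element = ET.SubElement(parent, xml_name(key))
    if value is None:
        return  # null: leave the element empty
    if isinstance(value, dict):
        for child_key, child_value in value.items():
            build(element, child_key, child_value)
    else:
        element.text = str(value)

def dict_to_xml(root_name, data):
    """Serialize a dict as an XML string with the given root element."""
    root = ET.Element(xml_name(root_name))
    for key, value in data.items():
        build(root, key, value)
    return ET.tostring(root, encoding="unicode")

print(dict_to_xml("item", {"tag": ["a", "b"], "note": None, "2nd price": 9.5}))
# <item><tag>a</tag><tag>b</tag><note /><_2nd_price>9.5</_2nd_price></item>
```

Repeating the element per array item keeps the output natural for XML consumers, at the cost of losing the difference between a one-item array and a single value on the way back.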
Examples of JSON to XML conversion in JavaScript and Python:
// ===== JavaScript: JSON to XML =====
import { XMLBuilder } from 'fast-xml-parser';
const jsonData = {
catalog: {
product: [
{
'@_id': '101',
name: 'Wireless Mouse',
price: { '@_currency': 'USD', '#text': '29.99' },
tags: { tag: ['wireless', 'mouse'] },
},
{
'@_id': '102',
name: 'Keyboard',
price: { '@_currency': 'USD', '#text': '49.99' },
tags: { tag: ['keyboard', 'mechanical'] },
},
],
},
};
const builder = new XMLBuilder({
ignoreAttributes: false,
attributeNamePrefix: '@_',
textNodeName: '#text',
format: true, // pretty print
indentBy: ' ',
suppressEmptyNode: true,
});
const xml = builder.build(jsonData);
console.log(xml);
# ===== Python: JSON to XML =====
import xmltodict
json_data = {
'catalog': {
'product': {
'@id': '101',
'name': 'Wireless Mouse',
'price': {'@currency': 'USD', '#text': '29.99'},
}
}
}
xml_output = xmltodict.unparse(json_data, pretty=True)
print(xml_output)
# Output:
# <?xml version="1.0" encoding="utf-8"?>
# <catalog>
# <product id="101">
# <name>Wireless Mouse</name>
# <price currency="USD">29.99</price>
# </product>
# </catalog>
Handling edge cases
Production-quality converters must handle numerous edge cases:
XML attributes: the most common edge case, usually handled with prefix conventions.
CDATA sections: unparsed content, treated as ordinary text.
Namespaces: add complexity to the mapping.
Mixed content: elements that contain both text and child elements.
Empty elements: may become null, an empty string, or an empty object.
Array detection: deciding when to use JSON arrays.
Whitespace handling: telling significant whitespace apart from formatting.
XML declaration and DTD: metadata with no JSON mapping.
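Mixed content is the case where a key-based mapping most easily loses information, because the order of text and child elements matters. One possible approach, sketched here with the standard library, is to emit an ordered list of segments instead of an object. The function name `mixed_content` and the list-based output shape are assumptions made for illustration, not an established convention.

```python
import xml.etree.ElementTree as ET

def mixed_content(el):
    """Return the element's text runs and child elements in document order."""
    parts = []
    if el.text and el.text.strip():
        parts.append(el.text)
    for child in el:
        parts.append({child.tag: mixed_content(child)})
        # Text following a child is stored on the child's .tail, not on the parent.
        if child.tail and child.tail.strip():
            parts.append(child.tail)
    return parts

root = ET.fromstring("<p>Hello <b>world</b> today</p>")
result = {root.tag: mixed_content(root)}
print(result)
# {'p': ['Hello ', {'b': ['world']}, ' today']}
```

The list form is more verbose than an object keyed by element name, but it lets a consumer rebuild the original sentence, which an object with a single text key cannot do.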
// Edge case examples: XML to JSON
// 1. Attributes + text content
// XML: <price currency="USD">29.99</price>
// JSON: { "price": { "@currency": "USD", "#text": "29.99" } }
// 2. CDATA section
// XML: <script><![CDATA[if (a < b) { alert("hello"); }]]></script>
// JSON: { "script": "if (a < b) { alert(\"hello\"); }" }
// 3. Namespaces
// XML: <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
// <soap:Body><GetPrice><Item>Apple</Item></GetPrice></soap:Body>
// </soap:Envelope>
// JSON: { "soap:Envelope": { "soap:Body": { "GetPrice": { "Item": "Apple" } } } }
// 4. Mixed content
// XML: <p>Hello <b>world</b> today</p>
// JSON: { "p": { "#text": ["Hello ", " today"], "b": "world" } }
// 5. Self-closing / empty elements
// XML: <br/> OR <item></item>
// JSON: { "br": null } OR { "item": "" }
// 6. Single vs multiple children (array detection)
// XML (one child): <items><item>A</item></items>
// JSON (no array): { "items": { "item": "A" } }
// XML (two children):<items><item>A</item><item>B</item></items>
// JSON (array): { "items": { "item": ["A", "B"] } }
// Solution: use isArray option in fast-xml-parser or force_list in xmltodict
// 7. Whitespace preservation
// XML: <code xml:space="preserve"> hello world </code>
// JSON: { "code": " hello world " }
// 8. XML declaration (dropped in JSON)
// XML: <?xml version="1.0" encoding="UTF-8"?>
// JSON: (not included in output)
XML security best practices
Security is paramount when working with XML parsers:
XXE attacks: disable external entity processing.
Billion Laughs attack: limit entity expansion.
DTD injection: disable loading of external DTDs.
Secure configuration: every language requires explicit security configuration for its XML parsers.
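When a dedicated hardening library is not available, one conservative option is to refuse any document that carries a DOCTYPE, since both XXE and Billion Laughs payloads rely on one. The sketch below does this with the standard-library Expat bindings. `DoctypeForbidden` and `parse_without_doctype` are hypothetical names, and the approach assumes your legitimate documents never need a DTD.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""

def parse_without_doctype(xml_text):
    """Parse xml_text, rejecting it first if it declares a DOCTYPE."""
    checker = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    # Expat calls this handler when it reaches the DOCTYPE, before any
    # entity reference in the document body is expanded.
    checker.StartDoctypeDeclHandler = on_doctype
    checker.Parse(xml_text, True)
    return ET.fromstring(xml_text)

safe = parse_without_doctype("<data>ok</data>")
print(safe.text)

payload = (
    '<?xml version="1.0"?>'
    '<!DOCTYPE data [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    "<data>&xxe;</data>"
)
try:
    parse_without_doctype(payload)
except DoctypeForbidden as error:
    print("rejected:", error)
```

The document is scanned twice, which is acceptable for small inputs. For large or streaming inputs, a library such as defusedxml that applies the same check in a single pass is a better fit.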
// ===== Secure XML parsing examples =====
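# --- Python: Use defusedxml ---
# UNSAFE for untrusted input: the standard-library xml.etree.ElementTree does not
# block entity-expansion attacks such as Billion Laughs on older Expat versions
import xml.etree.ElementTree as ET # DO NOT use with untrusted XML
# SAFER: defusedxml rejects entity declarations and external references
import defusedxml.ElementTree as SafeET
tree = SafeET.fromstring(untrusted_xml) # untrusted_xml: the incoming XML string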
# defusedxml blocks:
# - XML External Entity (XXE) attacks
# - Billion Laughs (entity expansion) attacks
# - External DTD retrieval
# - Decompression bombs
# --- Java: Secure DocumentBuilderFactory ---
import javax.xml.parsers.DocumentBuilderFactory;
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Disable all dangerous features
dbf.setFeature(
"http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setFeature(
"http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature(
"http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature(
"http://apache.org/xml/features/nonvalidating/load-external-dtd",
false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
// --- JavaScript (Node.js): fast-xml-parser ---
import { XMLParser } from 'fast-xml-parser';
const secureParser = new XMLParser({
  // fast-xml-parser does not fetch external entities; in v4 processEntities
  // defaults to true, so disable it to skip DOCTYPE-declared entities as well
  processEntities: false, // explicitly disable
  htmlEntities: false, // don't process HTML entities
  allowBooleanAttributes: false,
});
// --- .NET: Secure XmlReaderSettings ---
// XmlReaderSettings settings = new XmlReaderSettings();
// settings.DtdProcessing = DtdProcessing.Prohibit;
// settings.XmlResolver = null;
// === Example: XXE attack payload (for awareness) ===
// This malicious XML attempts to read /etc/passwd:
//
// <?xml version="1.0"?>
// <!DOCTYPE data [
// <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <data>&xxe;</data>
//
// A vulnerable parser would include file contents in output.
// A secure parser rejects the DOCTYPE declaration entirely.
// === Example: Billion Laughs payload ===
// This XML expands to ~3 GB of text from a few hundred bytes:
//
// <?xml version="1.0"?>
// <!DOCTYPE lolz [
// <!ENTITY lol "lol">
// <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
// <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
// ...
// ]>
// <data>&lol9;</data>
Frequently asked questions
What is the best way to convert XML to JSON?
It depends on the language. JavaScript: fast-xml-parser. Python: xmltodict. CLI: xq. For a quick one-off conversion, use an online tool.
How are XML attributes handled in JSON?
Attributes are mapped to keys with a prefix such as @. Text content is stored under a #text key when it coexists with attributes.
Is XML to JSON conversion safe?
XML parsing can be dangerous without proper configuration. Disable external entities, use defusedxml in Python, and configure parsers securely.
XML to JSON conversion is a fundamental skill for modern developers.