Parsing XML file in Node.js
I am using an Arch Linux system with KDE Plasma. I have an XML file of approximately 50 MB that I need to parse. The file has custom tags.

Example XML:

    <JMdict>
      <entry>
        <ent_seq>1000000</ent_seq>
        <r_ele>
          <reb>ヽ</reb>
        </r_ele>
        <sense>
          <pos>&unc;</pos>
          <gloss g_type="expl">repetition mark in katakana</gloss>
        </sense>
      </entry>
    </JMdict>

I have tried many solutions suggested on Stack Overflow, but none of them worked, and some packages, such as xml-stream and xml2json, could not be installed on my system. I decided to use xml2js (which most answers recommend) and got the same result. How can I use it correctly?

I am using this code, but it always logs undefined:

    const fs = require('fs-extra');
    const xml2js = require('xml2js');
    const parser = new xml2js.Parser();
    const path = "test.xml";
    fs.readFile(path, {encoding: 'utf-8'}, function(error, data) {
        parser.parseString(data, function(err, res) {
            console.log(res);
        });
    });

Result: undefined

Is there any way to handle an XML file by hand (without a package)?

javascript node.js xml
asked Jan 1 at 15:53 by Kaan Taha Köken
edited Jan 1 at 16:03 by jonrsharpe
Your "XML" file is not well-formed: it contains an undefined entity reference &unc;. So parsing should fail. – Michael Kay, Jan 1 at 18:58
3 Answers
The answer is below (the original post linked a working example).

    var fs = require('fs'),
        slash = require('slash'),
        xml2js = require('xml2js');

    var parser = new xml2js.Parser();
    let filename = slash(__dirname + '/foo.xml');

    fs.readFile(filename, "utf8", function(err, data) {
        if (err) {
            console.log('Err1111');
            console.log(err);
        } else {
            // Escape every "&" that does not start a predefined XML entity
            // (&apos; &quot; &gt; &lt; &amp;) or a numeric reference (&#...;).
            var escaped = data.replace(/&(?!(?:apos|quot|[gl]t|amp);|#)/g, '&amp;');
            parser.parseString(escaped, function (err, result) {
                if (err) {
                    console.log('Err');
                    console.log(err);
                } else {
                    console.log(JSON.stringify(result));
                    console.log('Done');
                }
            });
        }
    });

The key step is this replacement:

    data.replace(/&(?!(?:apos|quot|[gl]t|amp);|#)/g, '&amp;')

The only problem is the undefined entity &unc; in this tag:

    <pos>&unc;</pos>

After the replacement, the parser receives &amp;unc; and reads it as the literal text &unc; instead of failing on an unknown entity.

Reference and thanks to @tim.
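To see what the replacement does in isolation, here is a minimal sketch that applies the same regular expression to a short string. The helper name `escapeUndefinedEntities` is an assumption made for illustration; it is not part of xml2js.

```javascript
// Hypothetical helper (not part of xml2js): escape every "&" that does not
// begin one of the five predefined XML entities or a numeric reference.
function escapeUndefinedEntities(xml) {
  return xml.replace(/&(?!(?:apos|quot|[gl]t|amp);|#)/g, '&amp;');
}

// &unc; is undefined, while &amp;, &lt; and &#12541; are already valid.
const sample = '<pos>&unc;</pos><gloss>a &amp; b &lt; c &#12541;</gloss>';
console.log(escapeUndefinedEntities(sample));
```

Only the undefined reference is rewritten; references that are already valid pass through unchanged, so running the helper on clean input should be harmless.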
answered Jan 1 at 16:37 by Kitta
edited Jan 1 at 17:58
The way you use the xml2js package should be fine. However, your XML is not quite well-formed. If you add a console.log to see what is causing the error:

    fs.readFile(path, {encoding: 'utf-8'}, function(error, data) {
        parser.parseString(data, function(err, res) {
            if (err) console.log(err);
            console.log(res);
        });
    });

you'll see that it's the line <pos>&unc;</pos> that causes the problem. Because parsing fails, res is never set, which is why your code logs undefined.

If you fix the undefined entities (such as &unc;), the parser should work fine.
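For a file of around 50 MB it can help to locate the offending references before parsing. The following is a minimal sketch, assuming that the only named entities the parser accepts are the five predefined XML ones; the function name `findUndefinedEntities` is hypothetical and does not come from xml2js.

```javascript
// The five entities every XML parser knows without a DTD.
const PREDEFINED = new Set(['amp', 'lt', 'gt', 'apos', 'quot']);

// Hypothetical helper: list named entity references that are not predefined,
// together with the 1-based line on which each one occurs.
function findUndefinedEntities(xml) {
  const found = [];
  xml.split('\n').forEach((line, index) => {
    // Named references look like &name; numeric ones (&#...;) never match
    // because "#" cannot start a name in this pattern.
    const pattern = /&([A-Za-z_][\w.-]*);/g;
    let match;
    while ((match = pattern.exec(line)) !== null) {
      if (!PREDEFINED.has(match[1])) {
        found.push({ entity: match[1], line: index + 1 });
      }
    }
  });
  return found;
}

const sampleLines = [
  '<sense>',
  '<pos>&unc;</pos>',
  '<gloss g_type="expl">marks &amp; symbols</gloss>',
  '</sense>',
].join('\n');

console.log(findUndefinedEntities(sampleLines));
```

The result is a plain array, so it can be printed, counted, or used to decide whether the input needs cleaning before it is passed to `parser.parseString`.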
answered Jan 1 at 16:35 by Ray Chan
I think your problem is unescaped characters in your XML data.

I was able to get your example to work with the following.

XML data:

    <JMdict>
      <entry>
        <ent_seq>1000000</ent_seq>
        <r_ele>
          <reb>ヽ</reb>
        </r_ele>
        <sense>
          <pos>YOUR PROBLEM WAS HERE</pos>
          <gloss g_type="expl">repetition mark in katakana</gloss>
        </sense>
      </entry>
    </JMdict>

Node.js code:

    const fs = require('fs-extra');
    const xml2js = require('xml2js');
    const parser = new xml2js.Parser();
    const path = "test.xml";
    fs.readFile(path, {encoding: 'utf-8'}, function(error, data) {
        parser.parseString(data, function(err, res) {
            // Stop here if parsing failed; res is undefined in that case.
            if (err) return console.error(err);
            console.log(JSON.stringify(res.JMdict.entry, null, 4));
        });
    });

In situations like this, when I know the code should work, I always check the input data for possible issues.
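Instead of editing the data by hand, the undefined references could also be expanded from a lookup table before the string is handed to the parser. This is a minimal sketch under stated assumptions: the function names `expandEntities` and `escapeText`, and the table contents (including the expansion text for `unc`), are placeholders chosen for illustration, not definitions taken from the file.

```javascript
// Hypothetical table: entity name -> replacement text. The expansion shown
// for "unc" is a placeholder, not a definition taken from the real file.
const ENTITY_TABLE = { unc: 'unclassified' };

// Escape characters that would otherwise be read as markup.
function escapeText(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical helper: replace each named reference found in the table with
// its escaped expansion and leave every other reference untouched.
function expandEntities(xml, table) {
  return xml.replace(/&([A-Za-z_][\w.-]*);/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(table, name)
      ? escapeText(table[name])
      : whole
  );
}

console.log(expandEntities('<pos>&unc;</pos>', ENTITY_TABLE));
```

Compared with escaping the ampersand, this keeps readable text in the parsed result, at the cost of maintaining the table yourself.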
answered Jan 1 at 16:43 by tamak
edited Jan 1 at 16:50