XEROF

 

xlsgen 4.5.0.8 : CSV, HTML, JSON, XML : charset and language


Build 4.5.0.8 of xlsgen adds charset and language options for the four import file formats: CSV, HTML, JSON and XML.

By default, xlsgen infers data types according to the language of the user's regional settings, and it parses files using the charset markup built into them, whenever applicable.

Languages such as en_US, en_GB and fr_FR affect how data type inference recognizes numbers, currencies and dates.
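To make the effect of the language concrete, here is a minimal Python sketch of language-dependent inference. It is not xlsgen code: the separator and date-format tables, and the helper names `parse_number` and `parse_date`, are assumptions made up for this illustration, and real inference rules are certainly richer.

```python
from datetime import datetime

# Hypothetical, deliberately simplistic tables (not taken from xlsgen).
SEPARATORS = {
    "en_US": {"decimal": ".", "group": ","},
    "en_GB": {"decimal": ".", "group": ","},
    "fr_FR": {"decimal": ",", "group": " "},
}
DATE_FORMATS = {
    "en_US": "%m/%d/%Y",  # month first
    "en_GB": "%d/%m/%Y",  # day first
    "fr_FR": "%d/%m/%Y",  # day first
}

def parse_number(text, language):
    """Return a float if text reads as a number in this language, else None."""
    seps = SEPARATORS[language]
    cleaned = text.strip().replace("\u00a0", " ")  # treat no-break space as a space
    cleaned = cleaned.replace(seps["group"], "")    # drop grouping separators
    cleaned = cleaned.replace(seps["decimal"], ".") # normalize the decimal mark
    try:
        return float(cleaned)
    except ValueError:
        return None

def parse_date(text, language):
    """Return a date if text matches the language's short date format, else None."""
    try:
        return datetime.strptime(text.strip(), DATE_FORMATS[language]).date()
    except ValueError:
        return None

if __name__ == "__main__":
    print(parse_number("1 234,56", "fr_FR"))  # read as a number
    print(parse_number("1 234,56", "en_US"))  # not a number in US English
    print(parse_date("03/04/2018", "en_US"))  # March 4th
    print(parse_date("03/04/2018", "en_GB"))  # April 3rd
```

The same cell text yields a different value, or no value at all, depending on the language, which is why passing the right language before import matters.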

When you know in advance that the file being imported is in a given language, you can tell xlsgen before importing it by setting the following property:

worksheet.Import.CSV.Options.Language = "fr_FR"; // example of custom language used in an imported CSV file

This option is equally available for HTML, JSON and XML files and buffers.

The syntax of the language parameter is worth noting. It is made of the primary language code, followed by an underscore, then the secondary (country) code. As such, US English (en_US) behaves differently from British English (en_GB). The codes follow RFC 1766.
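As an illustration of that syntax, the following Python sketch normalizes a user-supplied tag to the language_COUNTRY form before it is handed to an importer. The function name `normalize_language_tag` is hypothetical, and the leniency about hyphens and letter case is an assumption of this sketch; the text above does not say whether xlsgen itself accepts such variants.

```python
import re

# Two letters, a separator, two letters, e.g. "fr_FR".
# Assumption: only this simple two-part form is handled here.
_LANGUAGE_TAG = re.compile(r"^([A-Za-z]{2})[_-]([A-Za-z]{2})$")

def normalize_language_tag(tag):
    """Return the tag as 'll_CC' (lowercase language, uppercase country)."""
    match = _LANGUAGE_TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not a language_COUNTRY tag: {tag!r}")
    return f"{match.group(1).lower()}_{match.group(2).upper()}"

if __name__ == "__main__":
    for candidate in ("fr_FR", "en-gb", "EN_us"):
        print(candidate, "->", normalize_language_tag(candidate))
```

Rejecting malformed tags early avoids silently falling back to the user's regional settings when a typo slips in.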

Regarding charsets, it's a bit more involved, because charset may be present or not in each file being imported, and specs vary depending on the file format.

- CSV file: the charset can be implicit for two-byte Unicode (UCS-2/UTF-16) and Unicode UTF-8 when a BOM marker is present at the beginning of the file. xlsgen already handles BOMs. Otherwise the charset is assumed to be the user's current code page. This can be overridden by setting the following property:

worksheet.Import.CSV.Options.Charset = "iso-8859-1"; // example of custom charset used in an imported CSV file

- XML file: the charset is explicit in the XML markup, on the first line. This can be overridden by setting the following property:

worksheet.Import.XML.Options.Charset = "iso-8859-1"; // example of custom charset used in an imported XML file

- JSON file: the charset defaults to Unicode UTF-8. This can be overridden by setting the following property:

worksheet.Import.JSON.Options.Charset = "iso-8859-1"; // example of custom charset used in an imported JSON file

- HTML file: the charset is explicit in the HTML markup, in the optional meta http-equiv element. This can be overridden by setting the following property:

worksheet.Import.HTML.Options.Charset = "iso-8859-1"; // example of custom charset used in an imported HTML file
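The per-format rules above can be sketched in a few lines of Python. This is an independent illustration, not how xlsgen is implemented: `detect_charset` and its parameters are hypothetical names, only the first kilobyte is inspected, UTF-8 for XML without a declaration follows the XML specification rather than anything stated here, and the HTML fallback is left undecided because the text does not specify one.

```python
import codecs
import locale
import re

# BOM markers and the charset each one implies.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]
# encoding="..." inside the XML declaration.
_XML_DECL = re.compile(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""")
# charset=... inside a <meta> element (http-equiv or HTML5 form).
_HTML_META = re.compile(rb"""<meta[^>]+charset=["']?([A-Za-z0-9._-]+)""", re.IGNORECASE)

def detect_charset(data, file_format, override=None):
    """Guess the charset of raw bytes for 'csv', 'xml', 'json' or 'html'."""
    if override:
        return override  # a custom charset wins over anything found in the file
    head = data[:1024]
    if file_format == "csv":
        for bom, name in _BOMS:
            if head.startswith(bom):
                return name
        # No BOM: fall back to the user's current encoding.
        return locale.getpreferredencoding(False)
    if file_format == "xml":
        match = _XML_DECL.search(head)
        return match.group(1).decode("ascii") if match else "utf-8"
    if file_format == "json":
        return "utf-8"
    if file_format == "html":
        match = _HTML_META.search(head)
        # Assumption: no fallback is defined here when the meta element is absent.
        return match.group(1).decode("ascii") if match else None
    raise ValueError(f"unsupported format: {file_format!r}")

if __name__ == "__main__":
    print(detect_charset(codecs.BOM_UTF8 + b"a;b\n", "csv"))
    print(detect_charset(b'<?xml version="1.0" encoding="iso-8859-1"?><r/>', "xml"))
    print(detect_charset(b"{}", "json", override="iso-8859-1"))
```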

When any of those files is imported from the internet, the HTTP response headers carry a charset spec too, which xlsgen sees and passes along. But the custom charset setting always overrides everything else.
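A small Python sketch of that precedence follows. The only rule taken from the text is that the custom charset overrides everything else; ranking the HTTP header above the in-file markup is an assumption of this sketch, and the names `charset_from_content_type` and `resolve_charset` are hypothetical.

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Extract the charset parameter of a Content-Type header, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()  # lowercased by the standard library

def resolve_charset(custom=None, http_content_type=None, in_file=None, default="utf-8"):
    """Pick the charset to decode with, most authoritative source first."""
    if custom:
        return custom  # the custom setting always wins
    if http_content_type:
        from_header = charset_from_content_type(http_content_type)
        if from_header:
            return from_header  # assumption: the header outranks in-file markup
    return in_file or default

if __name__ == "__main__":
    print(resolve_charset(http_content_type="text/csv; charset=ISO-8859-1"))
    print(resolve_charset(custom="windows-1252",
                          http_content_type="text/csv; charset=utf-8"))
```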

Posted on 24-March-2018 17:48 | Category: xlsgen, Excel generator

 

 
