Case Study #01 · B2B / Finance

Federal Treasury statement parser → 1C: 2-3 hours → 5 seconds

An accountant at a government-contract company processed Federal Treasury GIS XML statements by hand every week, converting them to the 1C format. I built a Python parser for both protocol versions, and the manual routine became a single batch run.

Industry

Accounting at a government-contract company

Stack

Python · xml.etree · Windows-1251

Timeline

~3 business days

Outcome

2-3 hours → 5 seconds

01 · Pain Point

Hours of manual XML parsing, repeating every week

An accountant at a government-contract company received XML statements weekly from the GIS "Electronic Budget" / Federal Treasury system. Each statement is a structured XML document with dozens of nested tags: recipient details, ORFK, legal entities, credit and debit amounts, balance positions per personal account.

To land in 1C, this data had to be transferred to the 1CClientBankExchange format — a bank-client text format in Windows-1251 encoding with a rigid structure of SectionDocument=... sections and dozens of required fields per payment.

In practice this looked like: open the XML in an editor, manually copy each field into the Excel mapping template Payment_Treasury_1C_mapping.xlsx, reconcile counterparties, check amounts against control points, convert to the right encoding, save the file — and the same for every payment in the statement.

02 · Solution

Python parser for two protocol versions → direct import to 1C

The architecture is simple and one-directional — hence the reliability. One CLI script reads the XML file, detects the protocol version, extracts all needed fields, runs them through the mapping, and writes the finished 1CClientBankExchange file in the correct encoding.

01

XML statement

File from Treasury GIS: V3 (TSE_BalanAcc_D13) or V4 (TSE_BalanAcc2_D13)

02

Parser

xml.etree.ElementTree, auto-detect version by root namespace

03

Mapping

BasicRequisites, ORFK, LegalEntity, balance_items → 1C fields

04

Serialization

1CClientBankExchange template, Windows-1251 encoding

05

1C

Direct load via the standard bank-client module
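Step 04 can be sketched roughly as follows. This is a minimal illustration, not the production serializer: the key names (ВерсияФормата, СекцияДокумент, КонецДокумента, КонецФайла) follow the commonly published 1CClientBankExchange layout and should be checked against the specification for your 1C configuration; the format version "1.03" and the two-field payment are assumptions, and a real file carries many more required header and document fields.

```python
def serialize_document(doc_type: str, fields: dict[str, str]) -> list[str]:
    """Render one document section as KEY=VALUE lines."""
    lines = [f"СекцияДокумент={doc_type}"]
    lines += [f"{key}={value}" for key, value in fields.items()]
    lines.append("КонецДокумента")
    return lines


def serialize_exchange(
    documents: list[tuple[str, dict[str, str]]],
    format_version: str = "1.03",  # assumption: use the version your 1C expects
) -> str:
    """Build the text of an exchange file; encoding happens in a later step."""
    lines = [
        "1CClientBankExchange",
        f"ВерсияФормата={format_version}",
        "Кодировка=Windows",
    ]
    for doc_type, fields in documents:
        lines += serialize_document(doc_type, fields)
    lines.append("КонецФайла")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative payment with a deliberately reduced field set.
    sample = [("Платежное поручение", {"Номер": "15", "Сумма": "100.00"})]
    print(serialize_exchange(sample), end="")
```

Keeping serialization as pure string building, separate from file writing, makes the output easy to compare against a known-good file before the encoding step.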

Two XML versions — one interface

The Treasury GIS uses two statement formats in parallel: V3 (TSE_BalanAcc_D13) and V4 (TSE_BalanAcc2_D13). Field structure and namespace differ between them. The parser detects the version from the root element and dispatches to the matching extraction strategy, so downstream code sees one normalized dict regardless of version.
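A minimal sketch of such detection and dispatch with xml.etree.ElementTree. It assumes the format identifiers appear as the root element's local name; in real files they may sit in the namespace URI instead. The extractor functions are hypothetical placeholders, not the project's code.

```python
import xml.etree.ElementTree as ET

# Assumption: the root element's local name identifies the format version.
ROOT_TO_VERSION = {
    "TSE_BalanAcc_D13": "V3",
    "TSE_BalanAcc2_D13": "V4",
}


def split_tag(tag: str) -> tuple[str, str]:
    """Split ElementTree's '{namespace}local' tag into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def detect_version(root: ET.Element) -> str:
    namespace, local = split_tag(root.tag)
    if local not in ROOT_TO_VERSION:
        raise ValueError(f"Unsupported statement format: {local!r} ({namespace!r})")
    return ROOT_TO_VERSION[local]


def extract_v3(root: ET.Element) -> dict:
    # Placeholder: a real extractor would walk the V3 tag layout here.
    return {"version": "V3", "namespace": split_tag(root.tag)[0]}


def extract_v4(root: ET.Element) -> dict:
    # Placeholder: a real extractor would walk the V4 tag layout here.
    return {"version": "V4", "namespace": split_tag(root.tag)[0]}


EXTRACTORS = {"V3": extract_v3, "V4": extract_v4}


def parse_statement(xml_text: str) -> dict:
    """Parse a statement and return the same dict shape for either version."""
    root = ET.fromstring(xml_text)
    return EXTRACTORS[detect_version(root)](root)
```

An unknown root raises immediately, so a future format version fails loudly instead of producing a half-filled file.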

Extraction of all meaningful fields

The parser pulls out not just the "header" but every payment and every balance position, with control-sum verification:

·BasicRequisites — document details, period, statement number
·ORFK — Federal Treasury body
·LegalEntity — legal entity, INN/KPP, personal account
·SDTotalSum / EDTotalSum — debit/credit control sums
·balance_items — each balance position broken down by KBK
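The control-sum check on the fields above could look like this. It is a sketch under assumptions: amounts arrive as decimal strings, and the declared total is the value read from a field such as SDTotalSum or EDTotalSum; the function and exception names are hypothetical.

```python
from decimal import Decimal


class ControlSumError(ValueError):
    """Raised when extracted amounts do not add up to the declared total."""


def verify_control_sum(label: str, amounts: list[str], declared_total: str) -> Decimal:
    # Decimal keeps kopecks exact; binary floats could introduce rounding noise.
    actual = sum((Decimal(amount) for amount in amounts), Decimal("0"))
    expected = Decimal(declared_total)
    if actual != expected:
        raise ControlSumError(
            f"{label}: items add up to {actual}, statement declares {expected}"
        )
    return actual
```

Running this before any output is written means a mismatched statement stops the script rather than reaching 1C.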

Windows-1251 encoding without surprises

1CClientBankExchange has historically required Windows-1251 — it's not an "option" but a hard requirement of the bank-client standard. The parser opens the output file with encoding="cp1251" and handles potential unmappable characters in advance — no garbled text on import.
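One way to handle unmappable characters ahead of time is to substitute the known offenders and then encode strictly. The substitution table below is illustrative only (the project's actual table is not shown here); each listed character is absent from cp1251.

```python
# Assumption: an illustrative set of characters that cp1251 cannot represent.
REPLACEMENTS = str.maketrans({
    "\u2212": "-",     # minus sign -> hyphen-minus
    "\u2011": "-",     # non-breaking hyphen -> hyphen-minus
    "\u2009": " ",     # thin space -> space
    "\u202f": " ",     # narrow no-break space -> space
    "\u20bd": "руб.",  # ruble sign -> abbreviation
})


def to_cp1251(text: str) -> bytes:
    """Encode for 1C, failing loudly on anything still unmappable."""
    cleaned = text.translate(REPLACEMENTS)
    # errors="strict" is the default: it raises UnicodeEncodeError instead of
    # silently writing "?" into a payment field.
    return cleaned.encode("cp1251")


def write_exchange_file(path: str, text: str) -> None:
    data = to_cp1251(text)  # encode first so a failure leaves no partial file
    with open(path, "wb") as handle:
        handle.write(data)
```

Strict encoding is a deliberate choice here: a crash on an unexpected character is easier to notice than a silently altered counterparty name.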

03 · Stack

The stack is deliberately boring — because it's reliable

Python 3.11+

One executable script, zero cloud dependencies

xml.etree.ElementTree

Standard library, no extra packages

Windows-1251 / cp1251

Encoding required by 1CClientBankExchange

1CClientBankExchange

Bank-client exchange standard — native import to 1C

Excel mapping table

Payment_Treasury_1C_mapping.xlsx — field reference

CLI

Launch from command line or via .bat wrapper at the workstation

Python · xml.etree · Windows-1251 · 1CClientBankExchange · CLI · Excel mapping

04 · Results

Comparison before and after

Processing time

2-3 hrs → 5 sec

per statement of any size

Accuracy

100%

control sums match the source XML exactly, no typos

Protocol coverage

V3 + V4

both versions of the Treasury GIS XML format

The main win isn't even time. The main win is that the entire class of manual-entry errors is eliminated. Every field now comes directly from the source of truth in the XML, with no "retyping the amount from the screen".

The accountant double-clicks the .bat wrapper, gets a ready-to-import file, and now checks reconciliation in 1C — where it should happen.

05 · Where it fits

Where else the same methodology applies

This case is not "a statement parser". It's a typical task of "structured document X → structured document Y through rigid mapping". The same architecture applies anywhere data moves between a state system and an accounting system:

→ Bank statements in 1C bank-client / SUFD format — same fields, different namespace
→ UPD / EDI documents from Diadoc / Kontur.Diadoc / SBIS → import to 1C Trade Management / Accounting
→ "Honest Sign" marking — XML reports into the accounting system
→ Tax returns / reports in non-standard FNS formats → internal spreadsheets
→ Tender documentation (XML exports from state portals) → CRM / internal Excel registries

What's reused on subsequent projects

CLI script template with format-version auto-detect by root element
Mapping via Excel reference table maintained by the accountant — no code changes
Control-sum verification before writing the file — fail-fast, errors don't reach accounting
Launch via .bat wrapper at the workstation — no Python for the end user
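The table-driven mapping from the list above can be sketched as below. How the workbook is read is left out; the rows are assumed to be already loaded as (source path, 1C field) pairs, and the sample path and target field name are hypothetical.

```python
def get_path(data: dict, dotted_path: str):
    """Follow a dotted path such as 'LegalEntity.INN' through nested dicts."""
    current = data
    for part in dotted_path.split("."):
        current = current[part]  # KeyError here means the mapping row is stale
    return current


def apply_mapping(normalized: dict, mapping_rows: list[tuple[str, str]]) -> dict[str, str]:
    """Build the 1C field dict from the normalized statement and mapping rows."""
    return {
        target_field: str(get_path(normalized, source_path))
        for source_path, target_field in mapping_rows
    }


if __name__ == "__main__":
    # Hypothetical row: in practice the rows come from the accountant's table.
    rows = [("LegalEntity.INN", "ПолучательИНН")]
    statement = {"LegalEntity": {"INN": "7700000000"}}
    print(apply_mapping(statement, rows))
```

Because the rows are data rather than code, adding or renaming a target field is an edit to the table, not to the script.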

Similar challenge?

If you have a source document and a target document with a rigid format — it's solvable

Document workflow parsers are the most predictable class of automation. Time from the first XML sample to production is 3-7 business days, and the ROI can be estimated on the back of an envelope.


Ready to start?

An audit for 5,000 ₽, with a concrete report and a cost estimate

I'll tell you what to implement in your business first, what the payback will be, and whether you need AI for your task at all (sometimes you don't).


Or just send me your question. I'll reply within 2 hours.