Add support for parsing values/quantities in accounting/comma format, where numbers are wrapped in parentheses.

scoobybejesus 2020-12-12 23:15:07 -05:00
parent f6e9b5525b
commit ce77cbf8b9
7 changed files with 112 additions and 31 deletions

Cargo.lock generated

@ -189,7 +189,7 @@ dependencies = [
[[package]]
name = "crptls"
version = "0.1.0"
version = "0.1.1"
dependencies = [
"chrono",
"csv",
@ -201,7 +201,7 @@ dependencies = [
[[package]]
name = "cryptools"
version = "0.11.0"
version = "0.11.1"
dependencies = [
"chrono",
"crptls",


@ -1,6 +1,6 @@
[package]
name = "cryptools"
version = "0.11.0"
version = "0.11.1"
authors = ["scoobybejesus <scoobybejesus@users.noreply.github.com>"]
edition = "2018"
description = "Command-line utility for processing cryptocurrency transactions into 'lots' and 'movements'."


@ -61,11 +61,12 @@ The rules for successfully preparing and maintaining the input file can generall
1. The first account must be given number `1`, and each additional account must count up sequentially.
2. `Proceeds` is the value of the transaction (measured in the home currency), whether spent, received, or exchanged.
It is **required** in order to properly calculate income/expense/gain/loss.
It is **required** in order to properly calculate income/expense/gain/loss, and it's always a positive number.
3. `Proceeds` must have a period as the decimal separator (`1,000.00` not `1.000,00`) and must not contain the ticker or symbol (USD or $).
4. Margin quote account `ticker`s must be followed by an underscore and the base account ticker (i.e., `BTC_xmr`).
5. Only home currency accounts can have negative balances. Non-margin crypto accounts may not go negative at any time.
(Exception: crypto margin accounts may go negative.)
6. There is now experimental support for values/quantities in 'Accounting'/'comma' format, meaning negative numbers may be surrounded by parentheses.
As you can see, most of the rules can generally be ignored.
In fact, the only tricky field is the `proceeds` column, but even that becomes second nature soon enough.
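
The `Proceeds` formatting rule can be illustrated with a minimal sketch. It assumes, as the import code in this commit does, that commas are stripped and the remainder is handed to Rust's standard `f32` parser. The helper name `parse_proceeds` is hypothetical and exists only for this illustration.

```rust
use std::num::ParseFloatError;

/// Hypothetical helper (not part of cryptools): strip commas, then parse
/// with the standard library's `FromStr` implementation for `f32`.
fn parse_proceeds(field: &str) -> Result<f32, ParseFloatError> {
    field.replace(",", "").parse::<f32>()
}

/// Walks through a few representative inputs.
fn proceeds_examples() {
    // Period as decimal separator, comma as thousands separator: parses as intended.
    assert_eq!(parse_proceeds("1,000.00"), Ok(1000.0));

    // Comma as decimal separator: stripping the comma leaves "1.00000",
    // which parses without error but yields 1.0 rather than 1000.0.
    assert_eq!(parse_proceeds("1.000,00"), Ok(1.0));

    // Currency symbols, tickers, and parentheses are rejected by the f32 parser.
    assert!(parse_proceeds("$1,000.00").is_err());
    assert!(parse_proceeds("1000 USD").is_err());
    assert!(parse_proceeds("(1,000.00)").is_err());
}
```

The second case is the one worth watching: a value written as `1.000,00` does not fail loudly under this approach, it is silently read as a much smaller number.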
@ -204,8 +205,8 @@ but be sure not to include the ticker or symbol of the currency
* *quantity*: This is similar to **proceeds**, in that the **decimal separator** must be a **period**,
and you *cannot* include the ticker or symbol of the currency in that field.
It is different from **proceeds** in that this will be parsed into a 128-bit precision decimal floating point number,
and a negative value can be indicated via a preceding `-`.
Negative values currently cannot be parsed if they are instead wrapped in parentheses (i.e., `(123.00)`).
and a negative value should be indicated via a preceding minus sign (`-`),
though experimental support now exists to parse negative values wrapped in parentheses (i.e., `(123.00)`).
##### Rows


@ -58,21 +58,16 @@ when appreciated cryptocurrency was used to make a tax-deductible charitable con
import and may cause unintended rounding issues.
* Microsoft Excel. Don't let Excel cause you to bang your head against a wall.
Picture this scenario. You keep your transactions for your input file in a Google Sheet,
and you're meticulous about making sure it's perfect.
You then download it as a CSV file and import it into `cryptools`.
It works perfectly, and you have all your reports.
Then you realize you'd like to quickly change a memo and re-run the reports, so you open the CSV file in Excel and edit it.
Then you import it into `cryptools` again and the program panics!
What happened is most likely that Excel changed the rounding of your precise decimals underneath you!
Depending on the rounding, `cryptools` may think your input file has been incorrectly prepared
because you've supposedly spent more coins than you actually owned at that time.
`Cryptools` does not let you spend coins you don't own, and it will exit upon finding such a condition.
The program is right, and your data is right, but Excel modified your data, so the program crashed for "no reason."
The solution is to have Excel already open, then in the ribbon's Data tab, you'll import your CSV file "From Text."
You'll choose Delimited, and Comma, and then highlight every column and choose Text as the data type.
* Currently, does not build on Windows due to the Termion crate (used for the print menu).
`Cryptools` does not let you spend coins you don't own, and it will panic/exit upon discovering such a condition.
You may believe your data is perfect, but Excel will change the precision of your numbers from underneath you if you're not careful.
If automatic rounding causes your values/quantities to change, the data may then suggest you *are* spending coins you don't have.
You must take steps to account for this.
- Transaction values/quantities must **not** be kept in 'General' formatting. Using 'numeric' or 'comma' is recommended.
- If opening a "correct" CSV that isn't otherwise formatted, instead go to the Data tab and import the CSV "From Text," avoiding 'General' as the data type.
- In either of these cases, for every cell with crypto transaction quantities/amounts, adjust rounding to view **8** decimal places.
- Excel writes numeric values to a CSV file as they appear in the cell, not their underlying actual value, so:
- Go into options and choose to "Set precision as displayed." This is found in different places in Mac and Windows.
- If your CSV Input File has MM-dd-YY date format, opening in Excel will change it to MM/dd/YY, so you'll have to pass the -d flag (or related `.env` variable).
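
To see why rounding alone can make valid data look like an overspend, here is a minimal sketch. It assumes quantities are held as whole integer units (here, units of 10^-8) and that the spreadsheet writes values rounded to the displayed number of decimal places. The helpers `round_to_places` and `overspends` are hypothetical and are not how `cryptools` itself tracks balances.

```rust
/// Hypothetical helper: round a non-negative quantity, stored as integer
/// units of 10^-`scale`, to `places` decimal places (half up).
/// The result is expressed in units of 10^-`places`.
/// Requires `scale >= places`.
fn round_to_places(units: u64, scale: u32, places: u32) -> u64 {
    let divisor = 10u64.pow(scale - places);
    (units + divisor / 2) / divisor
}

/// Hypothetical helper: true if `spend` exceeds the sum of `purchases`.
fn overspends(purchases: &[u64], spend: u64) -> bool {
    spend > purchases.iter().sum::<u64>()
}

/// Two purchases of 0.00400000 followed by a spend of 0.00800000.
fn rounding_example() {
    let exact = [400_000u64, 400_000u64]; // units of 10^-8
    let spend = 800_000u64;

    // With full precision, the spend is exactly covered.
    assert!(!overspends(&exact, spend));

    // If each cell is written out as displayed with 2 decimal places,
    // both purchases become 0.00 while the spend becomes 0.01.
    let rounded: Vec<u64> = exact.iter().map(|&u| round_to_places(u, 8, 2)).collect();
    let rounded_spend = round_to_places(spend, 8, 2);
    assert_eq!(rounded, vec![0, 0]);
    assert_eq!(rounded_spend, 1);

    // The rounded data now claims more was spent than was ever owned.
    assert!(overspends(&rounded, rounded_spend));
}
```

Displaying 8 decimal places, as recommended above, keeps the written values equal to the underlying ones for quantities recorded at that precision.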
## Installation
@ -101,7 +96,9 @@ To skip the wizard, there are three requirements:
`cryptools` will spit out an error message and then exit/panic if your CSV input file is malformed.
The error message will generally tell you why.
Consider using the python script (root directory of the repo) to sanitize your input file,
in case the file contains negative numbers in parentheses, numbers with commas, or extra rows/columns.
in case the file contains negative numbers in parentheses, numbers with commas, or extra rows/columns
(though now there is experimental support for 'Accounting'/'comma' number formatting,
meaning negative quantities can now be parsed even if indicated by parentheses instead of a minus sign).
See `/examples/` directory for further guidance,
or jump directly to the [examples.md](https://github.com/scoobybejesus/cryptools/blob/master/examples/examples.md) file.


@ -1,6 +1,6 @@
[package]
name = "crptls"
version = "0.1.0"
version = "0.1.1"
authors = ["scoobybejesus <scoobybejesus@users.noreply.github.com>"]
edition = "2018"


@ -165,7 +165,6 @@ fn import_transactions(
let mut this_tx_date: &str = "";
let mut this_proceeds: &str;
let mut this_memo: &str = "";
let mut this: String;
let mut proceeds_parsed = 0f32;
// Next, create action_records.
@ -180,10 +179,10 @@ fn import_transactions(
// Set metadata fields on first three fields.
if idx == 0 { this_tx_date = field; }
else if idx == 1 {
this = field.replace(",", "");
this_proceeds = this.as_str();
proceeds_parsed = this_proceeds.parse::<f32>()?;
let no_comma_string = field.replace(",", "");
proceeds_parsed = no_comma_string.parse::<f32>()?;
}
else if idx == 2 { this_memo = field; }
// Check for empty strings. If not empty, it's a value for an action_record.
@ -193,9 +192,26 @@ fn import_transactions(
let acct_idx = ind - 2; // acct_num and acct_key would be idx + 1, so subtract 2 from ind to get 1
let account_key = acct_idx as u16;
// TODO: implement conversion for negative numbers surrounded in parentheses
let amount_str = field.replace(",", "");
let amount = amount_str.parse::<d128>().unwrap();
let mut amount = amount_str.parse::<d128>().unwrap();
// When parsing to a d128, it won't error; rather it'll return a NaN. It must now check for NaN,
// and, if found, attempt to sanitize. These checks will convert accounting/comma format to the expected
// format by removing parentheses from negatives and adding a minus sign in the front. It will also
// attempt to remove empty spaces and currency symbols or designations (e.g. $ or USD).
if amount.is_nan() {
let b = sanitize_string_for_d128_parsing_basic(field).parse::<d128>().unwrap();
amount = b;
};
if amount.is_nan() {
let c = sanitize_string_for_d128_parsing_full(field).parse::<d128>().unwrap();
amount = c;
};
if amount.is_nan() {
println!("FATAL: Couldn't convert amount to d128 for transaction:\n{:#?}", record);
std::process::exit(1);
}
let amount_rounded = round_d128_1e8(&amount);
if amount != amount_rounded { changed_action_records += 1; changed_txn_num.push(this_tx_number); }
@ -219,6 +235,73 @@ fn import_transactions(
}
}
// Note: the rust Trait implementation of FromStr for f32 is capable of parsing:
// '3.14'
// '-3.14'
// '2.5E10', or equivalently, '2.5e10'
// '2.5E-10'
// '5.'
// '.5', or, equivalently, '0.5'
// 'inf', '-inf', 'NaN'
// Notable observations from the list:
// (a) scientific notation is accepted
// (b) accounting format (numbers in parens representing negative numbers) is not explicitly accepted
// Additionally notable:
// (a) the decimal separator must be a period
// (b) there can be no commas
// (c) there can be no currency info ($120 or 120USD, etc. will fail to parse)
// In summary, it appears to only allow: (i) numeric chars, (ii) a period, and/or (iii) a minus sign
//
// The Decimal::d128 implementation of FromStr calls into a C library, and that lib hasn't
// been reviewed (by me), but it is thought/hoped to follow similar parsing conventions,
// though there's no guarantee. Nevertheless, the above notes *appear* to hold true for d128.
fn sanitize_string_for_d128_parsing_basic(field: &str) -> String {
    // First, remove commas and spaces.
    let almost_done = field.replace(",", "").replace(" ", "");
    // Next, check for accounting formatting: a leading parenthesis marks a negative number.
    // `starts_with` is safe on an empty string, whereas indexing byte 0 would panic.
    if almost_done.starts_with('(') {
        return almost_done.replace("(", "-").replace(")", "");
    }
    almost_done
}
fn sanitize_string_for_d128_parsing_full(field: &str) -> String {
    // First, apply the basic pass: remove commas and spaces, and convert accounting formatting.
    let near_done = sanitize_string_for_d128_parsing_basic(field);
    // Strip everything except ASCII digits, periods, and minus signs.
    near_done.chars()
        .filter(|x| x.is_ascii_digit() || *x == '.' || *x == '-')
        .collect()
}
if let Some(incoming_ar) = incoming_ar {
let x = incoming_ar_num.unwrap();
action_records.insert(x, incoming_ar);
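
The staged fallback in this hunk (parse as-is, then a basic sanitize pass, then a full one) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from the commit: `f64` stands in for `d128`, so a failed parse is an `Err` rather than a NaN, and the names `sanitize_basic`, `sanitize_full`, and `parse_amount` are hypothetical.

```rust
/// Basic pass: drop commas and spaces, and rewrite a value that starts with
/// an opening parenthesis, such as "(123.00)", as "-123.00".
fn sanitize_basic(field: &str) -> String {
    let stripped: String = field.chars().filter(|c| *c != ',' && *c != ' ').collect();
    match stripped.strip_prefix('(') {
        Some(rest) => format!("-{}", rest.replace(')', "")),
        None => stripped,
    }
}

/// Full pass: run the basic pass, then keep only ASCII digits, periods,
/// and minus signs (this drops symbols such as "$" or "USD").
fn sanitize_full(field: &str) -> String {
    sanitize_basic(field)
        .chars()
        .filter(|c| c.is_ascii_digit() || *c == '.' || *c == '-')
        .collect()
}

/// Try the raw field first and fall back to progressively more aggressive
/// sanitizing. `f64` is a stand-in for `d128` in this sketch.
fn parse_amount(field: &str) -> Option<f64> {
    field
        .parse::<f64>()
        .ok()
        .or_else(|| sanitize_basic(field).parse::<f64>().ok())
        .or_else(|| sanitize_full(field).parse::<f64>().ok())
}

/// Representative inputs for each stage.
fn parse_amount_examples() {
    assert_eq!(parse_amount("-123.00"), Some(-123.0)); // parses as-is
    assert_eq!(parse_amount("1,234.50"), Some(1234.5)); // basic pass
    assert_eq!(parse_amount("(123.00)"), Some(-123.0)); // basic pass
    assert_eq!(parse_amount("($1,234.50)"), Some(-1234.5)); // full pass
    assert_eq!(parse_amount("123.00 USD"), Some(123.0)); // full pass
    assert_eq!(parse_amount("abc"), None); // nothing numeric survives
}
```

One limitation of this sketch: only a leading parenthesis is treated as a negative marker, so an input such as `$(5.00)` falls through to the full pass and comes back as positive `5.0`. Inputs in that shape are a good reason to keep treating the feature as experimental.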


@ -74,8 +74,8 @@ pub struct Cli {
/// File to be imported. Some notes on the columns: (a) by default, the program expects the `txDate` column to
/// be formatted as %m-%d-%y. You may alter this with ISO_DATE and DATE_SEPARATOR_IS_SLASH flags or environment
/// variables; (b) the `proceeds` column and any values in transactions must have a period (".") as the decimal
/// separator; and (c) any transactions with negative values must not be wrapped in parentheses (use the python
/// script for sanitizing/converting negative values).
/// separator; and (c) there is now experimental support for negative values being wrapped in parentheses. Use
/// the python script for sanitizing/converting negative values if they are a problem.
/// See .env.example for further details on environment variables.
#[structopt(name = "file_to_import", parse(from_os_str))]
file_to_import: Option<PathBuf>,