xlsxecute
xlsxecute is a new library from Jataware that allows you to "run" your Excel (.xlsx) based models directly from the command line using Python.
This is acheived by identifying the parameters that can change, providing the means to change those parameters without opening Excel, rendering the spreadsheet to a static format, and calculating every formula in every cell.
The result is a nearly-identical .xlsx file, except now the spreadsheet's cells contain only values, all without even requiring Excel to be installed.
But why Excel?
It's easy for professional software engineers to scoff at the idea of a simple script to "run Excel", but in reality, from the world of business to academia and all in between, much of the world's "numerical knowledge" is wrapped up in Excel spreadsheets.
And why shouldn't they be? Excel, and spreadsheets in general, are an excellent tool and continue to be a silent driver of productivity. Millions of people around the world use Excel on a regular basis for any number of tasks. This familiarity makes it the obvious tool to use any time they need to work on anything that requires numerical data or any math beyond the scope of the basic calculator app on their phone.
Excel is easy enough for pretty much anyone to pick up, but has enough functionality and depth to be used by scientists , economists , and major businesses .
With the ability to store both data and formulas in one document, Excel files allow for a portable and reproducible way to share analysis and results, all while (in a convoluted way, to be sure) showing your work.
Not to mention, sometimes spreadsheets can be hilarious.
But why not JUST Excel?
Ok, if Excel is so great, why not just use that? Why do we need to introduce Python to the party?
As we've established, an Excel spreadsheet is an excellent way to store data, formulas, and other information in a highly shareable way, but actually working with the spreadsheet is a different matter. To even view the spreadsheet, you need to have the right software, and in the case of Excel, to have purchased the right software.
Once you have Excel, to do anything with it, you have to open up a graphical interface. Making changes requires selecting cells, entering edit mode, and typing or pasting in the new value. This isn't onerous if you're the only one working on the spreadsheet, your results don't have to be fed into another system, and you don't have to make too many iterations of your spreadsheet.
But all that changes once your hard work is recognized and your carefully crafted Excel masterpiece now needs to be integrated into a bigger workflow.
Where this tool shines is in situations where you need an Excel model to be part of a larger workflow, but don't want to rewrite all your formulas in Python or R or some other tool. With this, you can programmatically collect data to copy into the spreadsheet, calculate everything, and then extract your results without ever touching your mouse.
xlsxecute can also be used inside a web app or API to provide dynamic responses based on an existing Excel formulas.
If you ever have cause to wonder if (or how) you can include your Excel formulas inside an Airflow or Prefect data processing flow, this is the tool for you.
How to use xlsxecute
This tool will take an Excel model (.xlsx), update any parameters as defined via command line arguments or in the parameters file, and calculate all cells, resulting in an Excel spreadsheet resembling the original, but with all formula cells replaced by the calculated values.
Parameters that define how to update cells in your spreadsheet can be provided in three ways: JSON file, CSV file, or command line arguments
If both a config file and command line arguments are provided, the command line arguments take precedence over the parameters in the config files.
Command line arguments:
-f "Sheet name.Cell1=Replacement value string" -f "Sheet name.Cell2=Replacement_value_float"
Note: Quotation marks are not required if there is no space in the parameter string.
Example:
-f "Variables.C2=red" -f Variables.C3=0.8
Config file formatting
Only one config file can be provided at a time. The config file can either be formatted as JSON or CSV (useful if you need to export directly from Pandas or a similar data processing flow).
JSON:
{
"Sheet name.Cell1": "Replacement value string",
"Sheet name.Cell2": Replacement_value_float
}
Example: params.json
{
"Variables.C2": "red",
"Variables.C3": 0.8
}
CSV:
Sheet name.Cell1,Replacement value string
Sheet name.Cell2,Replacement value float
Example: params.csv
Variables.C2,red
Variables.C3,0.8
NOTE: Do NOT include a header row in the CSV
Executable usage:
usage: xlsxecute [-h] [--output_dir OUTPUT_DIR] [--run_dir RUN_DIR] [--param {sheet}.{cell}={new_value}] source_file [parameter_file]
positional arguments:
source_file Excel (xlsx) file that contains the model
parameter_file Path to json or csv parameter file
optional arguments:
-h, --help show this help message and exit
--output_dir OUTPUT_DIR
Optional output location. (Default: output)
--run_dir RUN_DIR Optional directory to store intermediate files. (Default: runs)
--param {sheet}.{cell}={new_value}, -p {sheet}.{cell}={new_value}
Installation
Installation is done via the pip command
pip install xlsxecute
This will install all requirements and register the script to run.
You should now be able to run xlsxecute as described above.