can a system be built to examine the text? tbd.
02_control_data_integrity
WORKPAPERS ARE OPTIMIZED FOR MACHINE READABILITY
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 01: lists all workpapers within control 02
id: 01.01___workpaper_summary___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: workpaper summary for control 02: data integrity control
2_source:
class: provenance
value: builder authored
3_dependencies:
class: dependency
value: not applicable, this is a workpaper summary
4_exclusions:
class: exclusion
value: not applicable, this is a workpaper summary
5_subject_matter:
class: substance
value: expanded section
workpapers (machine readable + human verifiable)
01.01___workpaper_summary___d
02.01___control_overview___d
02.02___control_container_excel___d
02.03___control_example_excel___d
03.01___sha256_concepts___d
03.02___sha256_considerations___d
03.03___sha256_source___d
04.01___vba_id___d
04.02___vba_container___d
04.03___vba_environment_requirements___d
04.04___vba_script___d
05.01___python_id___d
05.02___python_container___d
05.03___python_environment_requirements___d
05.04___python_script___d
6_constraints:
class: constraint
value: not applicable, control policy documentation only
7_reproducibility:
class: reproducibility
value: not applicable, controls subjectively implemented by builder
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id: 02.01___control_overview___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: overview of control 02: data integrity control
2_source:
class: provenance
value: builder designed control (no external standard asserted)
3_dependencies:
class: dependency
value: not applicable, this is a control overview
4_exclusions:
class: exclusion
value: expanded section
control design subjectivity
control creation is subjectively designed by builder - another builder may choose a different control
correctness of source data
this control does not validate correctness of source data
errors introduced prior to static computation generation are outside scope
intentional manipulation
the control does not detect intentional data manipulation occurring prior to static computation generation
other human error scope
the control does not prevent or correct human error outside the specified control process
semantic meaning
this control does not detect semantic equivalence
semantic equivalence = differently represented data carrying the same meaning
5_subject_matter:
class: substance
value: expanded section
assurances
if the control passes, static and live computation results are equal at the evaluated scope
this provides assurance that no detectable change has occurred to the selected data since static computation was generated
this assurance applies only to the exact data selection, encoding, and boundaries used
this assurance does not extend beyond the evaluated scope
this assurance does not assert correctness, completeness, or intent
cell error/non-value handling
if a cell is #N/A, #VALUE!, #DIV/0!, etc. the cell return is addressed at the occurrence level
no resolution occurs within this control workpaper
resolution, if any, is done at the dataset level
evaluation and resolution are out of scope
change detection logic
change is detected when static and live computation results are not equal
detection is binary (change = 1 / no change = 0)
a value of 1 indicates the presence of change only; it does not quantify the number or magnitude of changes
no conclusion, diagnosis, or cause of change is interpreted within this logic
change resolution
detected changes are addressed at the occurrence level
no resolution occurs within this control
evaluation and resolution are out of scope
defined
control compares a static computation to a live computation for the purpose of detecting a change (if any)
changes are detected by manually reviewing the change-indicator calculation cell(s), where:
static computation result = live computation result → 0
static computation result ≠ live computation result → 1
results are reviewed manually by the user
data selection defined by user and varies
computation is SHA-256
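illustrative sketch (python) of the comparison defined above. the function names sha256_hex and change_indicator, the utf-16le encoding choice, and the sample value are assumptions made for illustration, not the builder's implementation.

```python
import hashlib


def sha256_hex(text: str, encoding: str = "utf-16-le") -> str:
    """Encode text to bytes, then return the SHA-256 digest as lowercase hex."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


def change_indicator(static_hex: str, live_hex: str) -> int:
    """Return 0 when static and live results are equal, 1 otherwise."""
    return 0 if static_hex == live_hex else 1


# static computation: generated once and stored
static_result = sha256_hex("sample value")

# live computation: derived from the current data each time it is evaluated
unchanged = change_indicator(static_result, sha256_hex("sample value"))
changed = change_indicator(static_result, sha256_hex("sample value "))  # trailing space added
print(unchanged, changed)
```

the indicator reports only that a difference exists; it does not locate or explain the difference.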
execution
static computation is generated once and stored; live computation is continuously derived from current data
equality comparison is executed by formula; interpretation is performed by the user
detected differences and confirmations of no change are subject to manual review
the control operates continuously and is not limited to a point-in-time execution
implementation safeguards (optional)
optional safeguards may be applied at implementation time
examples (non-exhaustive):
- freezing control columns
- protection against accidental formula overwrite
- separate storage of static computation
- table-based data selection
safeguard selection and implementation may vary by:
- database
- dataset
- execution environment (excel, vba, python)
absence of safeguards does not affect:
- change detection logic
- computation correctness
- sha-256 determinism
input basis
control computes SHA-256 over raw cell values, not displayed or formatted values
scope
the control applies to any user-defined data selection, including but not limited to:
single symbol(s), cell(s), column(s), row(s), table(s), or file-level outputs
control tests output equivalence only
the control operates on the selected data as a whole, not on semantic meaning
scope is defined at execution time by the user or system configuration
6_constraints:
class: constraint
value: expanded section
algorithm dependency
this control relies on deterministic computation, not on a specific algorithm
the computation algorithm used by the control may change over time
algorithm replacement does not alter the control logic or comparison method
excel related dependencies
correct execution of the computation function
integrity of the spreadsheet environment
nature of manual review control
this control requires manual review
manual processes are by nature subject to risk
unintentional manipulation
the control is designed to detect accidental and unintended changes
the control may surface intentional changes incidentally
7_reproducibility:
class: reproducibility
value: not applicable, this is a control overview
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id: 02.02___control_container_excel___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: documents microsoft excel as the execution, storage, and review container for control
2_source:
class: provenance
value: microsoft office (commercial software product)
3_dependencies:
class: dependency
value: not applicable, workpaper is to document container not control itself
4_exclusions:
class: exclusion
value: not applicable, workpaper is to document container not control itself
5_subject_matter:
class: substance
value: expanded section
configuration considerations
excel behavior may vary based on:
excel version
regional settings
calculation mode (automatic vs manual)
workbook protection settings
these factors may affect reproducibility and must be controlled or acknowledged by user
container role
microsoft excel serves as the execution, storage, and review container for control 02
excel provides:
a structured grid for data selection
formula-based computation and comparison
persistent storage of static computation results
visual inspection of change-indicator outputs
excel does not define the control logic or computation mechanism
execution context
control 02 executes within excel via worksheet formulas
execution characteristics:
static computation results are stored as fixed values in cells
live computation results are continuously derived from current cell values
equality comparison is executed by formula
change-indicator output is visible immediately upon data modification
no batch job or scheduled execution is required
storage behavior
excel stores:
source data values
static computation results
live computation formulas
change-indicator formulas
excel does not enforce immutability of stored values
persistence relies on standard workbook save behavior
user interaction and review
the user:
defines data selection boundaries
initiates static computation generation
observes change-indicator outputs
performs manual review when change is detected
excel provides visibility but does not perform interpretation or resolution
6_constraints:
class: constraint
value: not applicable, workpaper is to document container not control itself
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is to document container not control itself
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id: 02.03___control_example_excel___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: provide an illustrative example of control 02
2_source:
class: provenance
value: builder-authored
3_dependencies:
class: dependency
value: not applicable, workpaper is illustrative only
4_exclusions:
class: exclusion
value: not applicable, workpaper is illustrative only
5_subject_matter:
class: substance
value: expanded
example control
illustrates the control at a conceptual level only
example demonstrates how control 02 detects change, not how sha256 is computed
formulas in "row 1" are descriptive not live excel formulas
control output rule:
Output 0 when static and live sha256 values are identical
Output 1 when static and live sha256 values differ
control illustrative example
illustrative data = hdkslHSKD
static sha-256 = stored sha-256 computation result
live sha-256 = currently computed sha-256 computation result
computation formula=sha256_text(hdksIHSKD)
static sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
live sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
control formula = IF(static=live,0,1)
[static] = [live] → 0
[static] ≠ [live] → 1
control result = 0 (the two sha-256 computations agree)
notes:
stored result execution method varies - i.e. vba in excel, python
there are multiple excel formulas for calculation of sha256 computation
another formula could be used to identify differences between the static and live sha256 computations
6_constraints:
class: constraint
value: not applicable, workpaper is illustrative only
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is illustrative only
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id: 03.01___sha256_concepts___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: describe concepts related to sha-256 computation
2_source:
class: provenance
value: builder-authored
3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope
4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
5_subject_matter:
class: substance
value: expanded section
concepts grouped functionally, not alphabetically
5.1 concept notes: not concept specific
descriptions are intentionally brief
descriptions are grouped by function, not alphabetically
descriptions are functional, not dictionary-based
concepts included were subjectively chosen by builder with GPT assistance
5.2 primitives: smallest units a system operates on directly
primitive
the smallest unit a system treats as indivisible for its intended purpose, within that system’s abstraction boundary
may have meaning, but is not decomposed further within the system
combined to build more complex structures, but are not broken down further within the system
5.3 display primitives: what is converted to computer primitives
symbol
a character (e.g., A, 9, space, :, emoji)
some are human readable, and some are not (whitespace, non-printing characters, etc.)
symbols (sha-256)
not processed directly by sha-256
are interpreted only after encoding converts them into bytes
data scope
defined by explicit selection and boundary prior to computation by the end user
can be a single character, a word, a row, a table, a file, a folder, etc.
can include invisible characters (spaces, line breaks, formatting)
5.4 computer primitives: what computers see/interpret
bit
single binary value (0 or 1) — the smallest unit of data representation
the smallest unit computers use to represent information
byte
a group of 8 bits
how computers group bits meaningfully
one byte (a group of 8 bits) has 256 possible patterns (arrangements of 0s and 1s)
byte pattern
the ordered sequence of bits (0s and 1s) within a byte
computers map representation by the arrangement of bits into bytes
5.5 encoding: process of translating human readable symbols to computer readable bytes
encoding
can be likened to translating - from symbol to bytes (groups of bits - 0s and 1s)
encoding inputs: symbols identified by character set (Unicode code points or equivalent) prior to encoding
the mapping of symbols to byte patterns
encoding systems
stable schemas or frameworks that map symbol(s) to bytes (patterns of bits)
the same symbol may be represented with different byte patterns by different schemas
byte patterns must be stable within an encoding framework for results to be deterministic and comparable
code unit
the fixed-size unit of storage used by an encoding scheme to represent code points
not a character or a symbol
it is an intermediate representation between symbols and raw bytes
code units are defined by the encoding scheme, not by sha-256
UTF-8 uses 8-bit code units; UTF-16 uses 16-bit code units
one code point may map to one or more code units depending on the encoding
UTF-8 (encoding)
a type of byte mapping system where symbols (Unicode code points) are mapped to one or more 8-bit code units (8 bits = 1 byte)
symbol-to-byte mapping is stable across all UTF-8 systems
a given code point always maps to the same byte sequence under UTF-8
the mapping is deterministic (identical inputs = identical byte patterns)
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
UTF-16 (encoding)
a type of byte mapping system where symbols (Unicode code points) are mapped to one or two 16-bit code units (16 bits = 2 bytes); code points above U+FFFF use two code units (a surrogate pair)
symbol-to-code-unit mapping is stable across all UTF-16 systems; byte order (little endian vs big endian) must also match for byte patterns to be identical
a given code point always maps to the same code unit sequence under UTF-16
the mapping is deterministic (identical inputs = identical byte patterns)
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
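illustrative sketch (python) of the code unit counts described above. the three sample symbols and the helper name code_unit_counts are assumptions chosen for illustration.

```python
samples = {"A": "U+0041", "€": "U+20AC", "😀": "U+1F600"}


def code_unit_counts(symbol: str) -> tuple[int, int]:
    """Return (UTF-8 code units, UTF-16 code units) for one symbol."""
    utf8_units = len(symbol.encode("utf-8"))            # 8-bit units, 1 byte each
    utf16_units = len(symbol.encode("utf-16-le")) // 2  # 16-bit units, 2 bytes each
    return utf8_units, utf16_units


for symbol, code_point in samples.items():
    print(code_point, code_unit_counts(symbol))

# the mapping is reversible: decoding the bytes restores the symbol
round_trip = "€".encode("utf-8").decode("utf-8")
```

the third sample lies above U+FFFF, so it needs four UTF-8 code units and a surrogate pair (two code units) in UTF-16.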
5.6 sha-256 algorithm input concepts
input
symbols must be encoded into bytes before they are delivered to the sha-256 algorithm
sha-256 does not know which encoding system was used; it recognizes byte patterns only
sha-256 cannot see symbols, words, or meaning—only byte sequences
boundaries for byte input(s) are defined by user (look at word, row, columns a + b, entire table, etc.)
input and encoding
different encodings ⇒ different byte arrays ⇒ different sha-256 output
Same encoding + same symbols ⇒ same bytes ⇒ same sha-256 output
assumes same byte order and concatenation order
possibility of collisions noted and deemed out of scope
byte stream construction
the order in which bytes are concatenated is significant
the same bytes in a different order produce a different result
e.g., concatenating columns A+B vs B+A produces different byte streams and different hashes
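illustrative sketch (python) of the two input effects above: encoding choice and concatenation order. the cell values and variable names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte sequence as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


col_a, col_b = "alpha", "beta"  # hypothetical cell values

# same symbols, different encodings -> different bytes -> different output
utf8_hash = sha256_hex((col_a + col_b).encode("utf-8"))
utf16_hash = sha256_hex((col_a + col_b).encode("utf-16-le"))

# same encoding, different concatenation order -> different output
a_then_b = sha256_hex((col_a + col_b).encode("utf-8"))
b_then_a = sha256_hex((col_b + col_a).encode("utf-8"))

print(utf8_hash != utf16_hash, a_then_b != b_then_a, utf8_hash == a_then_b)
```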
5.7 sha-256 algorithm and computation concepts
algorithm
a fixed set of rules for transforming input into output
the rules do not change based on content
only the input affects the result
deterministic
same inputs (same bytes) produce the same result
change in input bytes produces a different result except for cryptographic collisions
different encoding systems produce different results because they produce different byte patterns for the same symbol(s)
sha-256 algorithm
A fixed, deterministic algorithm that converts input bytes into a 32-byte output
SHA = Secure Hash Algorithm
256 bits fixed output (32 bytes × 8 bits per byte)
if algorithm changes, it is no longer sha-256
sha-256 computation
execution of the sha-256 algorithm
sha-256 determinism
assumes a fixed encoding and consistent byte stream construction
sha-256 stability
the sha-256 specification text has received clarifications, but the algorithm definition has not changed
any change would create a different algorithm, not sha-256
this stability is critical for compatibility, verification, and security
sha-256 reversibility
raw output is non-reversible due to the many-to-one mapping from arbitrary-length inputs (effectively infinite possibilities) to a finite, fixed-length output space
the non-reversibility of the output makes the algorithm one way
5.8 sha-256 computation other factors (salting)
salting (explicitly excluded)
salt = extra bytes deliberately added to the input before sha-256 computation
no additional bytes are appended to input prior to sha-256 computation
extra data (salt) would cause identical inputs to produce different outputs when salt differs
salt is used to:
prevent precomputed (rainbow table) attacks
ensure identical passwords do not hash to the same value across different records
increase resistance to precomputation attacks
salt is good for passwords
salt is bad for deterministic comparison controls and will not be used
5.9 sha-256 computation other factors (keying)
keying (explicitly excluded)
key = secret value combined with input during sha-256 computation
key is applied deliberately before or during computation
keyed sha-256 computation produces different outputs for identical inputs when keys differ
how keying is used:
authentication and message integrity (e.g., HMAC)
ensures only parties with the key can reproduce or verify the hash
used to prove authenticity, not just detect change
why keying exists:
prevents unauthorized hash forgery
binds the hash result to possession of a secret
protects against tampering in adversarial environments
why keying is excluded here:
deterministic comparison requires identical inputs → identical outputs
secret keys would prevent independent recomputation by anyone who does not hold the key
control is not performing authentication or trust enforcement
key management is out of scope
5.10 control position - keying and/or salting
no salt added
no secret keys used
sha-256 executed without key material
output comparability preserved across environments
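illustrative sketch (python) of why salt and keys are excluded. the salt and key values are hypothetical placeholders, and the salt is prepended here only as one possible way of adding bytes to the input.

```python
import hashlib
import hmac

data = "sample value".encode("utf-8")

# control position: plain SHA-256, no salt, no key
plain_1 = hashlib.sha256(data).hexdigest()
plain_2 = hashlib.sha256(data).hexdigest()

# salted: extra bytes combined with the input change the output
salted = hashlib.sha256(b"hypothetical-salt" + data).hexdigest()

# keyed (HMAC-SHA-256): the output depends on possession of the key
keyed = hmac.new(b"hypothetical-key", data, hashlib.sha256).hexdigest()

print(plain_1 == plain_2, plain_1 != salted, plain_1 != keyed)
```

only the plain computation can be reproduced from the data alone, which is what a static vs live comparison requires.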
5.11 sha-256 algorithm output concepts (machine readable)
output general
sha-256 takes bytes as input and produces bytes as output
different encoding systems produce different byte patterns
different byte input produces different byte output
32 byte output
the raw result produced by the sha-256 algorithm
this output is always exactly 32 bytes (fixed output)
32 bytes x 8 bits per byte = 256 bits as in sha-256
raw output would be a grouping of 256 0s and 1s
below is a made-up example of one 32 byte output: 32 groups of 8 bits (0s and 1s)
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101
5.12 representation change from 32 byte output to human readable (not done by sha-256)
problem with 32 byte output
raw output difficult for human to read and store
hexadecimal
a number system with a base of 16
decimal -- base 10 -- digits 0-9
hex -- base 16 -- digits 0-9 plus a-f (a-f = 10-15)
64 character hexadecimal string
a standard representation of the sha-256 output (after computation)
the 32-byte output is converted into a 64-character hexadecimal string
the conversion methodology is stable - same 32 byte patterns produce same 64 character hexadecimal strings
more human readable and easier to store
each byte is represented by a combination of 2 characters
32 bytes x 2 characters per representation = 64 characters
the characters are referred to as hexadecimal because they are 0-9 and a-f specifically
the hexadecimal representation is reversible - can be converted back into bytes
fingerprint (raw)
fixed 32 byte sha-256 output
fingerprint (display)
the 32 byte output is converted and displayed as a 64 character hexadecimal string
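illustrative sketch (python) of the raw and display fingerprints described above. the sample input and the utf-8 encoding are assumptions for illustration.

```python
import hashlib

raw = hashlib.sha256("sample value".encode("utf-8")).digest()  # fingerprint (raw)
display = raw.hex()                                            # fingerprint (display)

print(len(raw), len(raw) * 8, len(display))  # 32 256 64

# each byte maps to two hexadecimal characters, and the mapping is reversible
restored = bytes.fromhex(display)
```

the conversion to hexadecimal happens after the computation; it changes the representation, not the result.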
6_constraints:
class: constraint
value: expanded section
algorithm longevity
SHA-256 is widely supported and not considered broken
future algorithms may supersede SHA-256 for some use cases
this documentation reflects the current state of SHA-256 usage
cryptographic collision risk
collision = two different byte inputs → same output
collisions are mathematically possible but practically infeasible to find for sha-256
accepted as outside the scope
7_reproducibility:
class: reproducibility
value: concepts based on builder (subjective), not intended for reproducibility
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id: 03.02___sha256_considerations___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: describe considerations related to sha-256 computation
2_source:
class: provenance
value: builder-authored
3_dependencies:
class: dependency
value: expanded
3.1 algorithm / standard evolution
shared computation assumes consistent algorithm version and standard interpretation
future deprecation or revision of hashing standards may require coordinated update
mismatch may occur if one side migrates algorithms earlier than the other
3.2 implicit defaults introduced by execution environment
definition
an environment may apply defaults unless explicitly overridden
examples (not exhaustive):
newline normalization
Unicode normalization defaults
implicit type coercion
why this is shared-computation-specific:
shared computation assumes explicitness
implicit behavior violates that assumption
risk exists even with correct code
4_exclusions:
class: exclusion
value: expanded
concepts grouped functionally, not alphabetically
incomplete constraint enumeration (identification of all possible constraints)
if static and live computations differ, root cause diagnosis is difficult because a required shared constraint may be missing or undocumented
diagnostic ambiguity (consequence risk, not cause)
definition
a hash mismatch does not indicate which constraint failed
further investigation is required
why this is shared-computation-specific:
shared computation collapses many dimensions into one output
the control is binary, diagnosis is not
human interpretation of “shared”
definition - “shared computation” may be misinterpreted as “same intent” rather than “same rules”
why this is shared-computation-specific:
future builders may assume equivalence without verification
creates false confidence
5_subject_matter:
class: substance
value: expanded
concepts grouped functionally, not alphabetically
5.1 constraint notes: scope & intent
constraints define conditions required for valid comparison
listed constraints are functional, not exhaustive
absence of a listed constraint does not imply it is permitted
implementation details may vary by environment
5.2 inputs (what is included)
inputs are explicitly defined; unspecified inputs are not included
scope boundary definition
- exact cells included
- order of inputs preserved
- inclusion and exclusion rules explicitly defined
input value source
- raw cell values used
- displayed or formatted values excluded
empty vs null policy
- blank cell handling explicitly defined
- distinction between empty ("") and null (if applicable) explicitly stated
data type coercion rules
- numbers, dates, booleans coerced to text using defined rules
- formatting rules explicitly stated
- implicit type inference not applied
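illustrative sketch (python) of explicit coercion rules. the function name coerce_to_text and every rule in it (blank as empty string, TRUE/FALSE, ISO 8601 dates) are assumptions for illustration; the actual rules are whatever the implementation states.

```python
from datetime import date, datetime


def coerce_to_text(value) -> str:
    """Convert one raw cell value to text using explicit, assumed rules."""
    if value is None:
        return ""                      # assumed policy: blank cell -> empty string
    if isinstance(value, bool):        # checked before int: bool is a subclass of int
        return "TRUE" if value else "FALSE"
    if isinstance(value, datetime):    # checked before date: datetime subclasses date
        return value.isoformat(timespec="seconds")
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return value
    # no implicit inference: unknown types are rejected instead of guessed
    raise TypeError(f"no coercion rule defined for {type(value).__name__}")


print([coerce_to_text(v) for v in (None, True, 3, 1.5, date(2024, 1, 31), "text")])
```

rejecting unlisted types keeps the rule set explicit: a value is either covered by a stated rule or the computation stops.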
5.3 transformations (how symbols become bytes)
transformations are deterministic and explicitly defined; unspecified transformations are not applied
encoding mandate
UTF-16LE
UTF = Unicode Transformation Format
unicode = a universal catalog of characters (letters, numbers, symbols)
transformation format = rules for converting Unicode characters into bytes
LE = little endian
referring to the byte order
when a value uses more than one byte, the system must define byte order:
least significant byte first (little endian) or most significant byte first (big endian)
applied uniformly to entire selected data scope
no BOM unless explicitly specified
BOM = byte order mark
BOM = a small sequence of bytes placed at the start of a byte stream
purpose of BOM is to signal what encoding is being used and/or signal byte order
BOM bytes, if present, are included in the byte stream and therefore affect SHA-256 output
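illustrative sketch (python) of byte order and BOM effects for a single symbol. the sample symbol and variable names are assumptions for illustration.

```python
import codecs
import hashlib

text = "A"  # U+0041

le_bytes = text.encode("utf-16-le")  # little endian, no BOM
be_bytes = text.encode("utf-16-be")  # big endian, no BOM
le_with_bom = codecs.BOM_UTF16_LE + le_bytes

print(le_bytes.hex(), be_bytes.hex(), le_with_bom.hex())  # 4100 0041 fffe4100

# the BOM bytes are part of the byte stream, so the SHA-256 output differs
no_bom_hash = hashlib.sha256(le_bytes).hexdigest()
bom_hash = hashlib.sha256(le_with_bom).hexdigest()
```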
Unicode normalization
- normalization form: explicitly stated
- NFC / NFD / NFKC / NFKD or none
- normalization occurs before encoding
- absence of normalization = input code point sequence preserved as-is
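illustrative sketch (python) of why the normalization form must be stated. the sample symbol and the helper name sha256_utf16le are assumptions for illustration.

```python
import hashlib
import unicodedata

composed = "\u00e9"      # é as one code point
decomposed = "e\u0301"   # e followed by a combining acute accent


def sha256_utf16le(text: str) -> str:
    """Encode as UTF-16LE without BOM and return the SHA-256 hex digest."""
    return hashlib.sha256(text.encode("utf-16-le")).hexdigest()


# without normalization the code point sequences differ, so the hashes differ
raw_equal = sha256_utf16le(composed) == sha256_utf16le(decomposed)

# normalizing both to NFC before encoding makes the byte streams identical
nfc_equal = sha256_utf16le(unicodedata.normalize("NFC", composed)) == sha256_utf16le(
    unicodedata.normalize("NFC", decomposed)
)

print(raw_equal, nfc_equal)  # False True
```

both strings display identically, which is why a mismatch of this kind is hard to find by visual inspection.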
case-folding policy
- none (unless explicitly applied)
- uppercase/lowercase transformations are out of scope unless specified
implicit exclusions
- no locale-based transformations
- no language-aware processing
- no semantic interpretation
byte stream construction (order + separators)
concatenation order
- explicitly defined
- order is significant (A+B ≠ B+A)
- row / column / table order preserved as selected
- no implicit reordering
delimiter policy
- explicitly defined
- delimiter value stated or none
- delimiter escaping rules explicitly stated or none
whitespace handling
- explicitly defined
- spaces, tabs, line breaks included or excluded as stated
- trimming applied or not applied as stated
newline canonicalization
- explicitly defined
- LF or CRLF
- mixed newline handling explicitly stated or none
BOM / null terminator policy
- explicitly defined
- present or absent
- no implicit insertion or removal
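illustrative sketch (python) of why delimiter and newline policies must be explicit. the helper name stream_hash, the unit separator (U+001F) delimiter, and the LF canonical form are hypothetical choices, not mandated by this workpaper.

```python
import hashlib


def stream_hash(values, delimiter=""):
    """Canonicalize CRLF to LF per value, join with the delimiter, hash as UTF-16LE."""
    canonical = [value.replace("\r\n", "\n") for value in values]
    return hashlib.sha256(delimiter.join(canonical).encode("utf-16-le")).hexdigest()


# without a delimiter, different cell boundaries collapse into the same stream
no_delim_same = stream_hash(["ab", "c"]) == stream_hash(["a", "bc"])

# with a delimiter the two selections stay distinct
with_delim_same = stream_hash(["ab", "c"], "\x1f") == stream_hash(["a", "bc"], "\x1f")

# CRLF and LF inputs match only because newlines are canonicalized first
newline_same = stream_hash(["line1\r\nline2"]) == stream_hash(["line1\nline2"])

print(no_delim_same, with_delim_same, newline_same)  # True False True
```

a delimiter only removes the ambiguity if it cannot occur inside a value, or if escaping rules are stated.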
serialization and range ordering rules
plain language explanation of difference:
Serialization = how each brick is made
Range ordering = the order the bricks are stacked
serialization rules: how individual values and structures are converted into bytes
(types, delimiters, whitespace, encoding, normalization, etc.)
- serialization = rules that convert structured selection → linear byte stream input
- identical selection + different serialization → different byte streams → different sha-256 output
- serialization rules define:
value representation (text vs number vs date vs boolean)
empty vs null policy
delimiter policy (none vs separator; escape rules)
whitespace handling (trim vs preserve; tabs/spaces)
newline canonicalization (lf vs crlf)
order policy (row order, column order, concatenation order)
bom / terminator policy (present/absent)
- mismatch in serialization rules invalidates comparison
- comparison validity requires: same selection + same encoding + same serialization
range ordering rules: the sequence in which serialized values are concatenated into the byte stream (row-major vs column-major, multi-area order, merged cells, traversal order)
selection unit: cell-by-cell (not “values only”)
traversal order: row-major (top→bottom, left→right)
within-row order: increasing column index
within-column progression: increasing row index
multi-area ranges (discontiguous): process areas in order, then row-major within each area
tables / structured refs: resolve to the underlying cell range, then apply the same rule
entire row/column selections: define the bounded range explicitly (or forbid them)
merged cells: define whether you read the top-left cell only or every address in the merge (pick one and state it)
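illustrative sketch (python) of traversal order for a rectangular selection. the 2x2 grid, the helper names, and the unit separator delimiter are assumptions for illustration.

```python
import hashlib

grid = [
    ["a1", "b1"],
    ["a2", "b2"],
]  # hypothetical selection: rows top to bottom, columns left to right


def row_major(cells):
    """Traverse row by row, increasing column index within each row."""
    return [value for row in cells for value in row]


def column_major(cells):
    """Traverse column by column, increasing row index within each column."""
    return [value for column in zip(*cells) for value in column]


def stream_hash(values, delimiter="\x1f"):
    """Join values with the delimiter and hash the UTF-16LE byte stream."""
    return hashlib.sha256(delimiter.join(values).encode("utf-16-le")).hexdigest()


print(row_major(grid))     # ['a1', 'b1', 'a2', 'b2']
print(column_major(grid))  # ['a1', 'a2', 'b1', 'b2']
orders_match = stream_hash(row_major(grid)) == stream_hash(column_major(grid))
```

the same cells in a different traversal order give a different byte stream, so both sides of the comparison must use the same rule.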
comparison method (what exactly is compared)
- comparison is performed on SHA-256 computation outputs
- comparison evaluates output equivalence only
identical outputs → no change
different outputs → change detected
- output representation
comparison uses raw SHA-256 output values
- if displayed values are compared:
hexadecimal representation must be consistent
hex character case explicitly defined (lowercase vs uppercase)
no spacing or formatting differences permitted
exclusions
- no locale-dependent interpretation
decimal separators
date formats
thousands separators
- no semantic interpretation of values
- no tolerance or fuzzy matching
comparison basis
- exact value equality
- byte-for-byte equivalence after computation and representation rules are applied
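illustrative sketch (python) of the display comparison risk noted above. the sample input and variable names are assumptions for illustration.

```python
import hashlib

digest = hashlib.sha256(b"sample value")
raw = digest.digest()
lower_hex = digest.hexdigest()   # hashlib returns lowercase hex
upper_hex = lower_hex.upper()    # same digest, different character case

# exact string equality treats the two displays as different
naive_equal = lower_hex == upper_hex

# comparing raw bytes avoids a false mismatch caused by hex character case
raw_equal = bytes.fromhex(lower_hex) == bytes.fromhex(upper_hex) == raw

print(naive_equal, raw_equal)
```

if displayed values are compared, fixing one hex case on both sides gives the same protection as comparing raw bytes.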
6_constraints:
class: constraint
value: expanded
constraint drift across implementations
static and live implementations may evolve independently
constraint adherence may diverge over time
mismatch may appear only at comparison time
shared-computation limitations
both sides can be “correct” in isolation
comparison validity requires identical rule application
risk statements (valid as constraints, not notes)
risk exists even with correct code
false confidence possible without verification
7_reproducibility:
class: reproducibility
value: lists considerations that impact comparability and reproducibility
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id: 03.03___sha256_source___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: documents authoritative source of the sha-256 algorithm
2_source:
class: provenance
value: see section 5 for source of sha-256
3_dependencies:
class: dependency
value: not applicable, sha-256 source identification only
4_exclusions:
class: exclusion
value: not applicable, sha-256 source identification only
5_subject_matter:
class: substance
value: expanded
source sha-256
national security agency (nsa) designed sha-2 (includes sha-256)
national institute of standards and technology (nist) standardized and published sha-256
nist: publisher and maintainer of u.s. cryptographic standards (including fips)
nist publishes standards used by governments, companies, and internet at large
standard on sha-256 is fips 180-4 (published 2012, updated 2015; sha-256 itself was first published by nist in 2001)
fips = federal information processing standard
fips 180-4
fips 180-4 defines the sha-2 family, including sha-256, and specifies:
Input preprocessing
Padding rules
Bitwise operations
Constants
Output format
document maintained at: https://csrc.nist.gov/pubs/fips/180-4/upd1/final
changes to fips 180-4
the document is actively maintained and updated by nist
sha-256 algorithm/computation does not change
the standard (containing documentation on sha-256 computation) can be updated
common uses
data integrity verification (file checksums)
password hashing (often combined with salt and key stretching)
digital signatures
blockchain & cryptocurrencies (e.g., Bitcoin mining)
secure authentication systems
6_constraints:
class: constraint
value: nist may change fips 180-4 standard language (algorithm computation remains stable)
7_reproducibility:
class: reproducibility
value: reproducibility not applicable, purpose is source identification
8_notes:
class: note
value: nist may change fips 180-4 website location at any time
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id: 04.01___vba_id___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document visual basic for applications (vba) as an optional execution method used to compute sha-256 within di01
2_source:
class: provenance
value: vba is embedded within microsoft excel and executes in the excel application context (not a standalone runtime)
3_dependencies:
class: dependency
value: vba remains accessible through excel
4_exclusions:
class: exclusion
value: wp is documenting vba as a possible computing mechanism within di01 only; execution steps and the specific vba code to compute sha-256 are out of scope
5_subject_matter:
class: substance
value: expanded
vba subject matter documented
vba is a programming language available within excel
sha-256 computed via vba or system libraries
external library invocation permitted
vba-produced sha-256 value may be generated as a live output and then stored as the static baseline by copying values
6_constraints:
class: constraint
value: not applicable, identification of execution method only
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id: 04.02___vba_container___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document excel as the container for visual basic for applications (vba)
2_source:
class: provenance
value: microsoft office excel (vendor-provided software)
3_dependencies:
class: dependency
value: excel continues to offer vba capability
4_exclusions:
class: exclusion
value: no vba execution logic, performance characteristics, or output validation documented
5_subject_matter:
class: substance
value: excel as container for vba execution (non-standalone runtime)
6_constraints:
class: constraint
value: vba cannot be executed independent of excel
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id: 04.03___vba_environment_requirements___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document environment setup to access vba for sha-256 computation
2_source:
class: provenance
value: this setup reflects what builder actually did
3_dependencies:
class: dependency
value: excel continues to offer vba capability; user has admin rights to make changes in OS
4_exclusions:
class: exclusion
value: expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
5_subject_matter:
class: substance
value: expanded
platform assumptions
operating system: windows
host application: microsoft excel
execution context: in-process (not standalone)
capability requirements
vba enabled
macro execution permitted
ability to load xlam / workbook modules
access to windows cng via bcrypt.dll
encoding invariant (important)
vba strings are utf-16le
this encoding is not configurable
sha-256 input bytes are derived from this encoding
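illustrative sketch of what the encoding invariant means at the byte level, written in python for comparison purposes (assumption: the vba side hashes the raw string bytes with no byte order mark; the function name vba_style_bytes is hypothetical). note that python's "utf-16" codec prepends a byte order mark, while "utf-16-le" does not.

```python
def vba_style_bytes(text: str) -> bytes:
    """Encode text as UTF-16LE with no byte order mark (two bytes per BMP character)."""
    return text.encode("utf-16-le")


if __name__ == "__main__":
    sample = "abc"
    print("utf-16-le:", vba_style_bytes(sample).hex(" "))
    print("utf-8:    ", sample.encode("utf-8").hex(" "))
    # The generic "utf-16" codec adds a 2-byte byte order mark in front.
    print("utf-16:   ", sample.encode("utf-16").hex(" "))
```

the same visible text yields different byte sequences under each encoding, so the sha-256 values differ as well.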
6_constraints:
class: constraint
value: vba cannot be executed independent of excel; excel and vba implicitly encode in utf-16le
7_reproducibility:
class: reproducibility
value: reproducible on windows + excel if steps followed; execution outputs validated at dataset level
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
OPEN id: 04.04___vba_script___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document vba script used in sha256 computation
2_source:
class: provenance
value: gpt authored script
3_dependencies:
class: dependency
value: excel continues to offer vba capability
4_exclusions:
class: exclusion
value: expanded
no executable logic
no performance characteristics
no output validation
excel and vba implicitly encode in utf-16le
script must likewise compute sha256 using utf-16le
5_subject_matter:
class: substance
value: OPEN – WILL BE UPDATED WHEN FINALIZED
6_constraints:
class: constraint
value: vba cannot be executed independent of excel
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: this wp is NOT complete, need to test new code live when using control in reality
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
id: 05.01___python_id___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document python as an optional execution method used to compute sha-256 within c02
2_source:
class: provenance
value: python is an open-source programming language governed and maintained by the python software foundation (psf) and the global python community
3_dependencies:
class: dependency
value: python runtime installed and accessible on execution host
4_exclusions:
class: exclusion
value: wp is documenting python as a possible computing mechanism within c02 only; execution steps and dataset-specific python scripts are out of scope
5_subject_matter:
class: substance
value: expanded
python subject matter documented
python is a general-purpose programming language executed as a standalone runtime
python is primarily used for batch, file-based computation and data transformation external to excel
python scripts vary by dataset to accommodate input structure and preprocessing; the sha-256 computation logic is invariant across datasets
python scripts read input data from files or defined input streams to produce sha-256 output
builder uses python as an external sha256 computation method
sha-256 is computed using python standard libraries or system-provided cryptographic libraries
python-produced sha-256 values are generated as stored static values
python outputs are persisted as file-based artifacts, not interactive cell outputs
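illustrative sketch of the file-in, file-out pattern described above (not the builder's script; function names, chunk size, and the artifact line layout are assumptions). this version hashes the raw bytes of a file; a script that hashes text must instead encode the text explicitly before hashing.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file's raw bytes in chunks so large inputs need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_hash_artifact(source: Path, artifact: Path) -> str:
    """Persist the digest as a static, file-based artifact and return it."""
    value = sha256_of_file(source)
    # Hypothetical layout: "<hex digest>  <file name>" on a single line.
    artifact.write_text(f"{value}  {source.name}\n", encoding="utf-8")
    return value
```

usage (hypothetical paths): write_hash_artifact(Path("input.csv"), Path("input.csv.sha256")) stores a value that does not change unless the script is rerun.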
6_constraints:
class: constraint
value: constraints arise from execution environment and integration targets (e.g., file formats consumed by excel)
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
id: 05.02___python_container___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document that python has no inherent container; execution context is runtime + OS and outputs are persisted to files
2_source:
class: provenance
value: the python software foundation (psf)
3_dependencies:
class: dependency
value: python execution is external to excel; excel functions only as a downstream analysis and control surface; python continues to be available and supported by the python software foundation (psf)
4_exclusions:
class: exclusion
value: no python execution logic, performance characteristics, or output validation documented
5_subject_matter:
class: substance
value: expanded
python has no inherent container; python is run directly and its outputs are published to external files
python is not exclusive to any type of computing device (pc or mac)
python is not exclusive to windows or any particular OS
6_constraints:
class: constraint
value: python is a standalone runtime, but not an application container or interface by default
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
id: 05.03___python_environment_requirements___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document environment setup to access python for sha-256 computation
2_source:
class: provenance
value: this setup reflects what builder actually did
3_dependencies:
class: dependency
value: python software foundation (psf) continues to support python scripting language; user has admin rights to make changes in OS
4_exclusions:
class: exclusion
value: expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
5_subject_matter:
class: substance
value: expanded
platform assumptions
operating system: os-agnostic (windows, macos, linux)
host application: none (standalone runtime)
execution context: external process (not in-process with excel)
capability requirements
python runtime installed (single installation; version documented once)
runtime reused across datasets unless explicitly changed
access to standard or approved cryptographic libraries
encoding decision (control-critical)
python scripts must explicitly define text encoding
sha-256 input bytes are derived from script-defined encoding (e.g., utf-8, utf-16le)
utf-16le is selected to align with excel/vba implicit encoding
encoding choice is a control decision and must remain consistent across datasets when comparability is required
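illustrative sketch of the encoding decision (not the builder's script; CONTROL_ENCODING and sha256_text are hypothetical names): the encoding is stated once, explicitly, and every text hash passes through it. the utf-8 call is included only to show that identical visible text hashes differently under a different encoding.

```python
import hashlib

# Assumption: mirrors the utf-16le selection documented above (no byte order mark).
CONTROL_ENCODING = "utf-16-le"


def sha256_text(text: str, encoding: str = CONTROL_ENCODING) -> str:
    """Encode text explicitly, then return the SHA-256 digest as lowercase hex."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


if __name__ == "__main__":
    sample = "abc"
    as_utf16le = sha256_text(sample)
    as_utf8 = sha256_text(sample, "utf-8")
    print("utf-16-le:", as_utf16le)
    print("utf-8:    ", as_utf8)
    print("hashes match:", as_utf16le == as_utf8)
```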
python installation
python runtime version is fixed at installation
python does not auto-update via os updates
runtime version changes require explicit reinstallation and revalidation
version installed – included for informational purposes only – Python 3.13 (64-bit) – identified by searching for app
approximate date of installation – included for informational purposes only – August 14, 2025 (per python file properties)
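illustrative sketch of an alternative way to identify the installed runtime from within python itself (assumption: this supplements, and does not replace, the builder's method; runtime_summary is a hypothetical name).

```python
import platform
import struct
import sys


def runtime_summary() -> dict:
    """Collect basic facts about the interpreter that is running this script."""
    return {
        "version": platform.python_version(),
        "major_minor": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Pointer size in bits distinguishes 32-bit from 64-bit builds.
        "bits": struct.calcsize("P") * 8,
        "implementation": platform.python_implementation(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in runtime_summary().items():
        print(f"{key}: {value}")
```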
6_constraints:
class: constraint
value: expanded
python is a standalone runtime with no inherent application container or user interface
python execution occurs external to excel
encoding alignment is control-critical:
- excel/vba implicitly encode strings as utf-16le
- python scripts MUST explicitly encode input text as utf-16le to produce comparable sha-256 outputs
- failure to align encoding will result in non-matching hashes for identical visible text
runtime constraints
python runtime version is fixed at installation
runtime does not auto-update via operating system updates
any runtime upgrade requires explicit reinstallation and revalidation
7_reproducibility:
class: reproducibility
value: expanded
reproducible provided:
identical python runtime major/minor version
identical script logic
identical explicit text encoding (utf-16le)
identical input data
python runtime updates do not occur automatically
reproducibility across time requires either:
preserving the original runtime version, or
revalidating outputs after a runtime upgrade
execution outputs are validated at the dataset level
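illustrative sketch of how the conditions above could be checked before two hashes are compared (not the builder's method; RunRecord, current_major_minor, and compare_runs are hypothetical names; identical script logic and identical input data are not checked here and remain the operator's responsibility).

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of the conditions under which a hash was produced."""
    runtime_major_minor: str  # e.g. "3.13"
    encoding: str             # e.g. "utf-16-le"
    sha256_hex: str


def current_major_minor() -> str:
    """Return the running interpreter's major.minor version."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def compare_runs(baseline: RunRecord, current: RunRecord) -> str:
    """Compare two records, refusing to compare hashes made under different conditions."""
    if baseline.encoding != current.encoding:
        return "not comparable: encoding differs"
    if baseline.runtime_major_minor != current.runtime_major_minor:
        return "revalidate: runtime version differs"
    if baseline.sha256_hex == current.sha256_hex:
        return "match"
    return "change detected"
```

the ordering reflects the text above: a hash difference is treated as a data change only after encoding and runtime version are confirmed identical.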
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
OPEN id: 05.04___python_script___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
1_overview:
class: context
value: document python script used in sha256 computation
2_source:
class: provenance
value: gpt authored script
3_dependencies:
class: dependency
value: python software foundation (psf) continues to support python scripting language; user has admin rights to make changes in OS
4_exclusions:
class: exclusion
value: expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
5_subject_matter:
class: substance
value: expanded
INSERT CODE HERE
6_constraints:
class: constraint
value:
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers
8_notes:
class: note
value: this wp is NOT complete, need to test new code live when using control in reality
END WORKPAPER