02_control_data_integrity

WORKPAPERS ARE OPTIMIZED FOR MACHINE READABILITY

 

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 01: lists all workpapers within control 02
id:  01.01___workpaper_summary___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD

1_overview:
class: context
value: workpaper summary for control 02: data integrity control
 
2_source:
class: provenance
value: builder authored
 
3_dependencies:
class: dependency
value: not applicable, this is a workpaper summary
 
4_exclusions:
class: exclusion
value: not applicable, this is a workpaper summary
 
5_subject_matter:
class: substance
value: expanded section
workpapers (machine readable + human verifiable)
01.01___workpaper_summary___d    
02.01___control_overview___d           
02.02___control_container_excel___d           
02.03___control_example_excel___d 

03.01___sha256_concepts___d
03.02___sha256_considerations___d
03.03___sha256_source___d

04.01___vba_id___d
04.02___vba_container___d
04.03___vba_environment_requirements___d
04.04___vba_script___d

05.01___python_id___d
05.02___python_container___d
05.03___python_environment_requirements___d
05.04___python_script___d
 

6_constraints:
class: constraint
value: not applicable, control policy documentation only

7_reproducibility:
class: reproducibility
value: not applicable, controls subjectively implemented by builder
 
8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id:  02.01___control_overview___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: overview of control 02: data integrity control
 
2_source:
class: provenance
value: builder designed control (no external standard asserted)
 
3_dependencies:
class: dependency
value: not applicable, this is a control overview
 
4_exclusions:
class: exclusion
value: expanded section
control design subjectivity
control creation is subjectively designed by builder - another builder may choose a different control
      
correctness of source data
this control does not validate correctness of source data  
errors introduced prior to static computation generation are outside scope 
      
intentional manipulation
the control does not detect intentional data manipulation occurring prior to static computation generation
      
other human error scope
the control does not prevent or correct human error outside the specified control process
      
semantic meaning
this control does not detect semantic equivalence  
semantic equivalence = two different representations that carry the same meaning; any difference in the underlying data is reported as change
 
5_subject_matter:
class: substance
value: expanded section
assurances
if the control passes, static and live computation results are equal at the evaluated scope
this provides assurance that no detectable change has occurred to the selected data since the static computation was generated
this assurance applies only to the exact data selection, encoding, and boundaries used
this assurance does not extend beyond the evaluated scope
this assurance does not assert correctness, completeness, or intent
      
cell error/non-value handling
if a cell returns #N/A, #VALUE!, #DIV/0!, etc., the cell return is addressed at the occurrence level
no resolution occurs within this control workpaper  
resolution, if any, is done at the dataset level  
evaluation and resolution are out of scope   
      
change detection logic
change is detected when static and live computation results are not equal 
detection is binary (change = 1 / no change = 0)  
a value of 1 indicates the presence of change only; it does not quantify the number or magnitude of changes
no conclusion, diagnosis, or cause of change is interpreted within this logic 
      
change resolution
detected changes are addressed at the occurrence level  
no resolution occurs within this control   
evaluation and resolution are out of scope   
      
defined
control compares a static computation to a live computation for the purpose of detecting a change (if any)
changes are detected by manually reviewing the change-indicator calculation cell(s), where:
static computation result = live computation result → 0  
static computation result ≠ live computation result → 1  
results are reviewed manually by the user   
data selection defined by user and varies   
computation is SHA-256    
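
illustrative sketch (python; not the builder's implementation)
shows the 0 / 1 change-indicator logic defined above: a static sha-256 is stored once, a live sha-256 is recomputed from current data, and the two are compared
assumptions: the names sha256_hex and change_indicator, the sample text, and the UTF-16LE encoding choice are hypothetical and used for illustration only

```python
import hashlib

def sha256_hex(text: str, encoding: str = "utf-16-le") -> str:
    """Return the lowercase hex SHA-256 of text under the given encoding."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()

def change_indicator(static_hash: str, live_text: str) -> int:
    """Return 0 when the live hash equals the stored hash, otherwise 1."""
    return 0 if sha256_hex(live_text) == static_hash else 1

# static computation: generated once and stored (hypothetical sample data)
static_hash = sha256_hex("invoice total 100")

# live computation: recomputed from whatever the data currently is
unchanged = change_indicator(static_hash, "invoice total 100")
changed = change_indicator(static_hash, "invoice total 101")

print(unchanged, changed)
```

the indicator reports presence of change only; it does not say what changed or why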
      
execution
static computation is generated once and stored; live computation is continuously derived from current data
equality comparison is executed by formula; interpretation is performed by the user
detected differences and confirmations of no change are subject to manual review 
the control operates continuously and is not limited to a point-in-time execution 
      
implementation safeguards (optional)
optional safeguards may be applied at implementation time  
examples (non-exhaustive):    
- freezing control columns    
- protection against accidental formula overwrite  
- separate storage of static computation   
- table-based data selection    
safeguard selection and implementation may vary by:  
- database     
- dataset     
- execution environment (excel, vba, python)   
absence of safeguards does not affect:   
- change detection logic    
- computation correctness    
- sha-256 determinism      
      
input basis
control computes SHA-256 over raw cell values, not displayed or formatted values 
      
scope
the control applies to any user-defined data selection, including but not limited to:
single symbol(s), cell(s), column(s), row(s), table(s), or file-level outputs 
control tests output equivalence only   
the control operates on the selected data as a whole, not on semantic meaning 
scope is defined at execution time by the user or system configuration 
 
6_constraints:
class: constraint
value: expanded section
algorithm dependency
this control relies on deterministic computation, not on a specific algorithm 
the computation algorithm used by the control may change over time 
algorithm replacement does not alter the control logic or comparison method 
      
excel related dependencies
correct execution of the computation function  
integrity of the spreadsheet environment   
      
nature of manual review control
this control requires manual review   
manual processes are by nature subject to risk  
      
unintentional manipulation
the control is designed to detect accidental and unintended changes 
the control may surface intentional changes incidentally  
 
7_reproducibility:
class: reproducibility
value: not applicable, this is a control overview

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id:  02.02___control_container_excel___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: documents microsoft excel as the execution, storage, and review container for control 02
 
2_source:
class: provenance
value: microsoft office (commercial software product)

3_dependencies:
class: dependency
value: not applicable, workpaper is to document container not control itself

4_exclusions:
class: exclusion
value: not applicable, workpaper is to document container not control itself
 
5_subject_matter:
class: substance
value: expanded section
configuration considerations
excel behavior may vary based on:     
excel version      
regional settings      
calculation mode (automatic vs manual)    
workbook protection settings     
these factors may affect reproducibility and must be controlled or acknowledged by user  
       
container role
microsoft excel serves as the execution, storage, and review container for control 02
excel provides:      
a structured grid for data selection     
formula-based computation and comparison    
persistent storage of static computation results    
visual inspection of change-indicator outputs    
excel does not define the control logic or computation mechanism   
       
execution context
control 02 executes within excel via worksheet formulas
execution characteristics:      
static computation results are stored as fixed values in cells   
live computation results are continuously derived from current cell values   
equality comparison is executed by formula    
change-indicator output is visible immediately upon data modification   
no batch job or scheduled execution is required    
       
storage behavior
excel stores:      
source data values      
static computation results     
live computation formulas     
change-indicator formulas     
excel does not enforce immutability of stored values    
persistence relies on standard workbook save behavior      
       
user interaction and review
the user:      
defines data selection boundaries     
initiates static computation generation     
observes change-indicator outputs     
performs manual review when change is detected    
excel provides visibility but does not perform interpretation or resolution   

6_constraints:
class: constraint
value: not applicable, workpaper is to document container not control itself
 
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is to document container not control itself

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 02: control overview
id:  02.03___control_example_excel___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: provide an illustrative example of control 02
 
2_source:
class: provenance
value: builder-authored

3_dependencies:
class: dependency
value: not applicable, workpaper is illustrative only

4_exclusions:
class: exclusion
value: not applicable, workpaper is illustrative only
 
5_subject_matter:
class: substance
value:  expanded
example control
illustrates the control at a conceptual level only
example demonstrates how control 02 detects change, not how sha256 is computed
formulas in "row 1" are descriptive, not live excel formulas
 
control output rule:
Output 0 when static and live sha256 values are identical
Output 1 when static and live sha256 values differ
 
control illustrative example
illustrative data = hdkslHSKD
static sha-256 = stored sha-256 computation result
live sha-256 = currently computed sha-256 computation result
 
computation formula=sha256_text(hdksIHSKD)
static sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
live sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
control formula = IF(static=live,0,1)
[static] = [live] → 0
[static] ≠ [live] → 1
control result = 0 (the two sha-256 computations agree)
 
notes:
stored result execution method varies (e.g., vba in excel, python)
there are multiple excel formulas for calculating the sha256 computation
another formula could be used to identify differences between the static and live sha256 computations
 
6_constraints:
class: constraint
value: not applicable, workpaper is illustrative only  


7_reproducibility:
class: reproducibility
value: not applicable, workpaper is illustrative only

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id:  03.01___sha256_concepts___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: describe concepts related to sha-256 computation
 
2_source:
class: provenance
value: builder-authored

3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope

4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
 
5_subject_matter:
class: substance
value:  expanded section

concepts grouped functionally, not alphabetically

5.1 concept notes: not concept specific
descriptions are intentionally brief    
descriptions grouped by function not alphabetical   
descriptions are functional, not dictionary-based   
concepts included were subjectively chosen by builder with GPT assistance  
       
5.2 primitives: smallest units a system operates on directly
primitive      
the smallest unit a system treats as indivisible for its intended purpose within that system’s abstraction boundary
may have meaning, but is not decomposed further within the system
primitives are combined to build more complex structures, but are not broken down further within the system
       
5.3 display primitives: what is converted to computer primitives
symbol      
a character (e.g., A, 9, space, :, emoji, etc.)
some are human readable, and some are not (whitespace, non-printing characters, etc.) 
       
symbols (sha-256)     
not processed directly by sha-256    
are interpreted only after encoding converts them into bytes  
       
data scope      
defined by the end user through explicit selection and boundary prior to computation
can be a single character, a word, a row, a table, a file, a folder, etc.
can include invisible characters (spaces, line breaks, formatting)
       
5.4 computer primitives: what computers see/interpret
       
bit      
single binary value (0 or 1) — the smallest unit of data representation  
the smallest unit computers use to represent information  
       
byte      
a group of 8 bits
how computers group bits meaningfully
one byte has 256 possible patterns (2^8 arrangements of 0s and 1s)

byte pattern
the ordered sequence of bits (0s and 1s) within a byte
computers represent data by the arrangement of bits into bytes
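
illustrative sketch (python) of the bit / byte / byte pattern concepts above
shows that 8 bits give 256 patterns and how a byte value looks when written as its 8-bit pattern
assumptions: the values 65 and 66 are arbitrary sample byte values chosen for illustration

```python
# one byte = 8 bits, so it has 2**8 possible patterns
patterns_per_byte = 2 ** 8

# the byte pattern for the value 65, written as 8 bits
pattern_65 = format(65, "08b")

# a bytes object is an ordered sequence of byte values
raw = bytes([65, 66])
patterns = [format(b, "08b") for b in raw]

print(patterns_per_byte, pattern_65, patterns)
```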
       
5.5 encoding: process of translating human readable symbols to computer readable bytes
encoding      
can be likened to translating - from symbol to bytes (groups of bits - 0s and 1s)  
encoding inputs: symbols identified by character set (Unicode code points or equivalent) prior to encoding 
the mapping of symbols to byte patterns   
       
encoding systems     
stable schemas or frameworks that map symbol(s) to bytes (patterns of bits)  
the same symbol may be represented by different byte patterns under different schemas
byte patterns must be stable within an encoding framework for results to be deterministic and comparable 
       
code unit      
the fixed-size unit of storage used by an encoding scheme to represent code points 
not a character or a symbol    
it is an intermediate representation between symbols and raw bytes  
code units are defined by the encoding scheme, not by sha-256
UTF-8 uses 8-bit code units; UTF-16 uses 16-bit code units  
one code point may map to one or more code units depending on the encoding  
       
UTF-8 (encoding)     
a type of byte mapping system where symbols (Unicode code points) are mapped to one to four 8-bit code units (8 bits = 1 byte)
symbol-to-byte mapping is stable across all UTF-8 systems  
a given code point always maps to the same byte sequence under UTF-8  
the mapping is deterministic (identical inputs = identical byte patterns)   
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)  
       
UTF-16 (encoding)     
a type of byte mapping system where symbols (Unicode code points) are mapped to one or two 16-bit code units (16 bits = 2 bytes); code points above U+FFFF use two code units (a surrogate pair)
symbol-to-code-unit mapping is stable across all UTF-16 systems; the resulting byte order depends on the variant (UTF-16LE or UTF-16BE)
a given code point always maps to the same code unit sequence under UTF-16
the mapping is deterministic (identical inputs = identical byte patterns)
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
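
illustrative sketch (python) of the encoding concepts above
shows the same symbols producing different byte patterns under UTF-8 and UTF-16LE, and that each mapping is reversible
assumptions: the three sample symbols are arbitrary; the dictionary labels are hypothetical names used only for printing

```python
# escapes are used so the source itself stays plain ascii
samples = {
    "A": "A",                       # U+0041
    "e-acute": "\u00e9",            # U+00E9
    "grinning face": "\U0001F600",  # U+1F600, above U+FFFF
}

for name, symbol in samples.items():
    utf8 = symbol.encode("utf-8")
    utf16 = symbol.encode("utf-16-le")
    # each encoding is reversible back to the original symbol
    assert utf8.decode("utf-8") == symbol
    assert utf16.decode("utf-16-le") == symbol
    print(name, "utf-8:", utf8.hex(), "utf-16le:", utf16.hex())

# number of bytes each symbol occupies under each encoding
utf8_lengths = [len(s.encode("utf-8")) for s in samples.values()]
utf16_lengths = [len(s.encode("utf-16-le")) for s in samples.values()]
```

the symbol above U+FFFF occupies four bytes in both encodings, but as four 8-bit code units in UTF-8 and as two 16-bit code units (a surrogate pair) in UTF-16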
       
5.6 sha-256 algorithm input concepts
input      
symbols must be converted to bytes before they are delivered to the sha-256 algorithm
sha-256 does not know what encoding system is used; it recognizes byte patterns only
sha-256 cannot see symbols, words, or meaning; it sees only byte sequences
boundaries for byte input(s) are defined by the user (e.g., a word, a row, columns a + b, an entire table, etc.)
       
input and encoding     
different encodings ⇒ different byte arrays ⇒ different sha-256 output  
Same encoding + same symbols ⇒ same bytes ⇒ same sha-256 output   
assumes same byte order and concatenation order   
possibility of collisions noted and deemed out of scope   
       
byte stream construction    
the order in which bytes are concatenated is significant  
the same bytes in a different order produce a different result  
e.g., concatenating columns A+B vs B+A produces different byte streams and different hashes 
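
illustrative sketch (python) of byte stream construction order
shows that the same two byte sequences concatenated as A+B and as B+A produce different sha-256 outputs
assumptions: the values "alpha" and "beta" and the UTF-16LE encoding are hypothetical sample choices

```python
import hashlib

# hypothetical column values, encoded to bytes before hashing
col_a = "alpha".encode("utf-16-le")
col_b = "beta".encode("utf-16-le")

# same bytes, different concatenation order
hash_ab = hashlib.sha256(col_a + col_b).hexdigest()
hash_ba = hashlib.sha256(col_b + col_a).hexdigest()

print(hash_ab)
print(hash_ba)
print("equal:", hash_ab == hash_ba)
```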
       
5.7 sha-256 algorithm and computation concepts
algorithm      
a fixed set of rules for transforming input into output   
the rules do not change based on content   
only the input affects the result    
       
deterministic      
same inputs (same bytes) produce the same result   
change in input bytes produces a different result except for cryptographic collisions 
different encoding systems produce different results because they produce different byte patterns for the same symbol(s) 
       
sha-256 algorithm     
A fixed, deterministic algorithm that converts input bytes into a 32-byte output  
SHA = Secure Hash Algorithm    
256 bits fixed output (32 bytes × 8 bits per byte)   
if algorithm changes, it is no longer sha-256   
       
sha-256 computation     
execution of the sha-256 algorithm    
       
sha-256 determinism     
assumes a fixed encoding and consistent byte stream construction  

sha-256 stability
the sha-256 specification text has received clarifications, but the algorithm definition has not changed
any change would create a different algorithm, not sha-256  
this stability is critical for compatibility, verification, and security  
       
sha-256 reversibility
raw output is non-reversible due to the many-to-one mapping from arbitrary-length inputs (effectively infinite possibilities) to a finite, fixed-length output space 
the non-reversibility of the output makes the algorithm one way  
       
5.8 sha-256 computation other factors (salting)
salting (explicitly excluded)      
salt = extra bytes deliberately added to the input before sha-256 computation  
no additional bytes are appended to input prior to sha-256 computation  
extra data (salt) would cause identical inputs to produce different outputs when salt differs 
       
salt is used to:     
prevent precomputed (rainbow table) attacks   
ensure identical passwords do not hash to the same value across different records 
increase resistance to precomputation attacks   
salt is good for passwords    
salt is bad for deterministic comparison controls and will not be used  
       
5.9  sha-256 computation other factors (keying)
keying (explicitly excluded)    
key = secret value combined with input during sha-256 computation  
key is applied deliberately before or during computation  
keyed sha-256 computation produces different outputs for identical inputs when keys differ 
       
how keying is used:     
authentication and message integrity (e.g., HMAC)   
ensures only parties with the key can reproduce or verify the hash  
used to prove authenticity, not just detect change   
       
why keying exists:     
prevents unauthorized hash forgery    
binds the hash result to possession of a secret   
protects against tampering in adversarial environments   
       
why keying is excluded here:    
deterministic comparison requires identical inputs → identical outputs  
secret keys would prevent independent recomputation by parties without the key
control is not performing authentication or trust enforcement  
key management is out of scope    
       
5.10 control position - keying and/or salting   
no salt added     
no secret keys used     
sha-256 executed without key material   
output comparability preserved across environments   
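
illustrative sketch (python) of why salting and keying are excluded
shows that plain sha-256 gives identical output for identical input, while a differing salt or key changes the output
assumptions: the sample data, the salt values, the key values, and the choice to prepend the salt are hypothetical; HMAC is used as the example of keyed hashing named above

```python
import hashlib
import hmac

data = "record 42".encode("utf-16-le")  # hypothetical sample input

# control position: no salt, no key -> identical input gives identical output
plain_1 = hashlib.sha256(data).hexdigest()
plain_2 = hashlib.sha256(data).hexdigest()

# salting: extra bytes added to the input (hypothetical salt values)
salted_1 = hashlib.sha256(b"salt-one" + data).hexdigest()
salted_2 = hashlib.sha256(b"salt-two" + data).hexdigest()

# keying: HMAC-SHA-256 with different secret keys (hypothetical key values)
keyed_1 = hmac.new(b"key-one", data, hashlib.sha256).hexdigest()
keyed_2 = hmac.new(b"key-two", data, hashlib.sha256).hexdigest()

print("plain equal: ", plain_1 == plain_2)
print("salted equal:", salted_1 == salted_2)
print("keyed equal: ", keyed_1 == keyed_2)
```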
       
5.11 sha-256 algorithm output concepts (machine readable)
output general
sha-256 takes bytes as input and produces bytes as output
different encoding systems produce different byte patterns
different byte input produces different byte output (except for cryptographic collisions)

32 byte output
the raw result produced by the sha-256 algorithm
this output is always exactly 32 bytes (fixed output)
32 bytes x 8 bits per byte = 256 bits as in sha-256
raw output is a sequence of 256 bits (0s and 1s)
below is made up example of one 32 byte output - 32 sets of 8 bits (0s and 1s)  
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111 
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101 
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111 
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101 
       
5.12 representation change from 32 byte output to human readable (not done by sha-256)
problem with 32 byte output    
raw output difficult for human to read and store   
       
hexadecimal
number system with a base of 16
decimal -- base 10 -- digits 0-9
hex -- base 16 -- digits 0-9 plus 6 letters a-f (a-f = 10-15)
       
64 character hexadecimal string    
a standard representation of the sha-256 output (after computation)  
the 32-byte output is converted into a 64-character hexadecimal string  
the conversion methodology is stable - same 32 byte patterns produce same 64 character hexadecimal strings
more human readable and easier to store   
each byte is represented by a combination of 2 characters  
32 bytes x 2 characters per representation = 64 characters   
the characters are referred to as hexadecimal because they are 0-9 and a-f specifically 
the hexadecimal representation is reversible - can be converted back into bytes  

fingerprint (raw)
fixed 32 byte sha-256 output
fingerprint (display)
the 32 byte output converted and displayed as a 64 character hexadecimal string
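
illustrative sketch (python) of the raw and display fingerprints
shows the 32 byte raw output, its 64 character hexadecimal representation, and the reverse conversion from hexadecimal back to bytes
assumptions: the input is the three ascii bytes "abc", a commonly published sha-256 test input; the variable names are hypothetical

```python
import hashlib

# fingerprint (raw): 32 bytes
raw = hashlib.sha256(b"abc").digest()

# fingerprint (display): 64 hexadecimal characters, 2 per byte
hex_text = raw.hex()

# the representation change is reversible: hex text back to the same bytes
round_trip = bytes.fromhex(hex_text)

# hexadecimal letters a-f stand for the values 10-15
letter_values = {ch: int(ch, 16) for ch in "abcdef"}

print(len(raw), len(hex_text))
print(hex_text)
```

the conversion to hexadecimal is a representation step performed after the sha-256 computation; it does not change the underlying 32 bytes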

6_constraints:
class: constraint
value:   expanded section
 
algorithm longevity
SHA-256 is widely supported and not considered broken 
future algorithms may supersede SHA-256 for some use cases 
this documentation reflects the current state of SHA-256 usage 
      
cryptographic collision risk
collision = two different byte inputs → same output  
a collision is mathematically possible but practically infeasible to find for sha-256
accepted as outside the scope   
 
7_reproducibility:
class: reproducibility
value: concepts based on builder (subjective), not intended for reproducibility

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id:  03.02___sha256_considerations___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: describe considerations that affect comparability of sha-256 computations
 
2_source:
class: provenance
value: builder-authored

3_dependencies:
class: dependency
value: expanded
3.1 algorithm / standard evolution
shared computation assumes consistent algorithm version and standard interpretation
future deprecation or revision of hashing standards may require coordinated update
mismatch may occur if one side migrates algorithms earlier than the other 
      
3.2 implicit defaults introduced by execution environment
definition     
an environment may apply defaults unless explicitly overridden 
      
examples (not exhaustive):   
newline normalization    
Unicode normalization defaults   
implicit type coercion    
      
why this is shared-computation-specific:  
shared computation assumes explicitness  
implicit behavior violates that assumption  
risk exists even with correct code   
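
illustrative sketch (python) of implicit defaults changing the byte stream
shows two of the listed examples: implicit type coercion and newline differences each produce a different sha-256 output for what a person may regard as the same data
assumptions: the helper name sha256_of_text, the sample values, and the UTF-16LE encoding are hypothetical choices for illustration

```python
import hashlib

def sha256_of_text(text: str) -> str:
    """Hash text as UTF-16LE bytes (encoding chosen for illustration)."""
    return hashlib.sha256(text.encode("utf-16-le")).hexdigest()

# implicit type coercion: the same quantity rendered as different text
as_int_text = str(1)      # "1"
as_float_text = str(1.0)  # "1.0"

# newline differences: the same visible lines with different line endings
lf_text = "line1\nline2"
crlf_text = "line1\r\nline2"

coercion_differs = sha256_of_text(as_int_text) != sha256_of_text(as_float_text)
newline_differs = sha256_of_text(lf_text) != sha256_of_text(crlf_text)

# an explicit rule (here: convert CRLF to LF) restores comparability
canonical = crlf_text.replace("\r\n", "\n")
explicit_rule_matches = sha256_of_text(canonical) == sha256_of_text(lf_text)

print(coercion_differs, newline_differs, explicit_rule_matches)
```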

4_exclusions:
class: exclusion
value: expanded
concepts grouped functionally, not alphabetically

incomplete constraint enumeration (identification of all possible constraints)
if static and live computations differ, root cause diagnosis is difficult because a required shared constraint may be missing or undocumented
      
diagnostic ambiguity (consequence risk, not cause)
definition     
a hash mismatch does not indicate which constraint failed 
further investigation is required   
why this is shared-computation-specific:  
shared computation collapses many dimensions into one output 
the control is binary, diagnosis is not  
      
human interpretation of “shared”
definition - “shared computation” may be misinterpreted as “same intent” rather than “same rules”
why this is shared-computation-specific:  
future builders may assume equivalence without verification 
creates false confidence 

5_subject_matter:
class: substance
value:  expanded
concepts grouped functionally, not alphabetically

5.1 constraint notes: scope & intent
constraints define conditions required for valid comparison  
    listed constraints are functional, not exhaustive   
    absence of a listed constraint does not imply it is permitted  
    implementation details may vary by environment   
       
5.2 inputs (what is included)
inputs are explicitly defined; unspecified inputs are not included  
       
scope boundary definition    
- exact cells included     
- order of inputs preserved    
- inclusion and exclusion rules explicitly defined   
       
input value source     
- raw cell values used     
- displayed or formatted values excluded   
       
empty vs null policy     
- blank cell handling explicitly defined   
- distinction between empty ("") and null (if applicable) explicitly stated  
       
data type coercion rules     
- numbers, dates, booleans coerced to text using defined rules  
- formatting rules explicitly stated    
- implicit type inference not applied    
       
5.3 transformations (how symbols become bytes)
transformations are deterministic and explicitly defined; unspecified transformations are not applied 
       
encoding mandate     
UTF-16LE      
UTF = Unicode Transformation Format   
unicode = a universal catalog of characters (letters, numbers, symbols)  
transformation format = rules for converting Unicode characters into bytes  
LE = little endian      
referring to the byte order    
when a value uses more than one byte, the system must define byte order:
least significant byte first (little endian) or most significant byte first (big endian)
applied uniformly to entire selected data scope   
       
no BOM unless explicitly specified    
BOM = byte order mark     
BOM = a small sequence of bytes placed at the start of a byte stream  
purpose of BOM is to signal what encoding is being used and/or signal byte order  
BOM bytes, if present, are included in the byte stream and therefore affect SHA-256 output 
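
illustrative sketch (python) of byte order and BOM effects
shows the bytes for one symbol under UTF-16LE without BOM, UTF-16LE with BOM, and UTF-16BE, and that each byte stream hashes differently
assumptions: the symbol "A" is an arbitrary sample; variable names are hypothetical

```python
import codecs
import hashlib

text = "A"  # U+0041

no_bom = text.encode("utf-16-le")        # least significant byte first
with_bom = codecs.BOM_UTF16_LE + no_bom  # BOM bytes ff fe placed at the start
big_endian = text.encode("utf-16-be")    # most significant byte first

# each distinct byte stream yields a distinct sha-256 output
hashes = {
    "utf-16le, no bom": hashlib.sha256(no_bom).hexdigest(),
    "utf-16le, bom": hashlib.sha256(with_bom).hexdigest(),
    "utf-16be, no bom": hashlib.sha256(big_endian).hexdigest(),
}

for label, value in hashes.items():
    print(label, value)
```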
       
Unicode normalization     
- normalization form: explicitly stated   
- NFC / NFD / NFKC / NFKD or none   
- normalization occurs before encoding   
- absence of normalization = input code point sequence preserved as-is  
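
illustrative sketch (python) of Unicode normalization before encoding
shows two code point sequences that display as the same symbol hashing differently when preserved as-is, and identically after NFC normalization
assumptions: the sample symbol, the helper name sha256_of_text, and the choice of NFC are hypothetical; the control requires only that the form be explicitly stated

```python
import hashlib
import unicodedata

def sha256_of_text(text: str) -> str:
    """Hash text as UTF-16LE bytes without BOM."""
    return hashlib.sha256(text.encode("utf-16-le")).hexdigest()

composed = "\u00e9"     # e-acute as a single code point
decomposed = "e\u0301"  # letter e followed by a combining acute accent

# no normalization: input code point sequences are preserved as-is
as_is_equal = sha256_of_text(composed) == sha256_of_text(decomposed)

# normalization applied before encoding
nfc_composed = unicodedata.normalize("NFC", composed)
nfc_decomposed = unicodedata.normalize("NFC", decomposed)
nfc_equal = sha256_of_text(nfc_composed) == sha256_of_text(nfc_decomposed)

print("as-is equal:", as_is_equal)
print("nfc equal:  ", nfc_equal)
```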
       
case-folding policy     
- none (unless explicitly applied)    
- uppercase/lowercase transformations are out of scope unless specified  
       
implicit exclusions
- no locale-based transformations    
- no language-aware processing    
- no semantic interpretation    
       
byte stream construction (order + separators)
concatenation order     
- explicitly defined    
- order is significant (A+B ≠ B+A)   
- row / column / table order preserved as selected  
- no implicit reordering    
delimiter policy     
- explicitly defined    
- delimiter value stated or none   
- delimiter escaping rules explicitly stated or none  
whitespace handling     
- explicitly defined    
- spaces, tabs, line breaks included or excluded as stated  
- trimming applied or not applied as stated   
newline canonicalization    
- explicitly defined    
- LF or CRLF     
- mixed newline handling explicitly stated or none  
BOM / null terminator policy    
- explicitly defined    
- present or absent    
- no implicit insertion or removal   
       
serialization and range ordering rules
plain language explanation of difference:   
Serialization = how each brick is made   
Range ordering = the order the bricks are stacked  
serialization rules: how individual values and structures are converted into bytes
(types, delimiters, whitespace, encoding, normalization, etc.)
- serialization = rules that convert structured selection → linear byte stream input 
- identical selection + different serialization → different byte streams → different sha-256 output 
- serialization rules define:   
value representation (text vs number vs date vs boolean)  
empty vs null policy    
delimiter policy (none vs separator; escape rules)  
whitespace handling (trim vs preserve; tabs/spaces)  
newline canonicalization (lf vs crlf)   
order policy (row order, column order, concatenation order)  
bom / terminator policy (present/absent)  
- mismatch in serialization rules invalidates comparison  
- comparison validity requires: same selection + same encoding + same serialization 
range ordering rules:  the sequence in which serialized values are concatenated into the byte stream (row-major vs column-major, multi-area order, merged cells, traversal order)
selection unit: cell-by-cell (not “values only”)   
traversal order: row-major (top→bottom, left→right)   
within-row order: increasing column index   
within-column progression: increasing row index   
multi-area ranges (discontiguous): process areas in order, then row-major within each area 
tables / structured refs: resolve to the underlying cell range, then apply the same rule  
entire row/column selections: define the bounded range explicitly (or forbid them)  
merged cells: define whether you read the top-left cell only or every address in the merge (pick one and state it) 
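
illustrative sketch (python) of serialization and range ordering
shows row-major vs column-major traversal producing different sha-256 outputs for the same selection, and why a delimiter policy matters for cell boundaries
assumptions: the 2 x 2 grid, the function names, and the delimiter (ascii unit separator) are hypothetical; delimiter escaping is not handled, so values are assumed not to contain the delimiter

```python
import hashlib

# hypothetical 2 x 2 selection of cell values, already coerced to text
grid = [["a", "b"],
        ["c", "d"]]

DELIMITER = "\x1f"  # hypothetical delimiter (ascii unit separator)

def row_major(cells):
    """Traverse row by row, left to right within each row."""
    return [value for row in cells for value in row]

def column_major(cells):
    """Traverse column by column, top to bottom within each column."""
    return [row[col] for col in range(len(cells[0])) for row in cells]

def serialize(values, delimiter=DELIMITER):
    """Join values with the delimiter and encode as UTF-16LE without BOM."""
    return delimiter.join(values).encode("utf-16-le")

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

hash_row_major = sha256_hex(serialize(row_major(grid)))
hash_column_major = sha256_hex(serialize(column_major(grid)))

# without a delimiter, different cell boundaries collapse to the same bytes
no_delim_1 = serialize(["ab", "c"], delimiter="")
no_delim_2 = serialize(["a", "bc"], delimiter="")
with_delim_1 = serialize(["ab", "c"])
with_delim_2 = serialize(["a", "bc"])

print(hash_row_major == hash_column_major)
print(no_delim_1 == no_delim_2, with_delim_1 == with_delim_2)
```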
       
comparison method (what exactly is compared)
- comparison is performed on SHA-256 computation outputs  
- comparison evaluates output equivalence only  
identical outputs → no change   
different outputs → change detected   
- output representation    
comparison uses raw SHA-256 output values   
- if displayed values are compared:   
hexadecimal representation must be consistent  
hex character case explicitly defined (lowercase vs uppercase)  
no spacing or formatting differences permitted  
exclusions     
- no locale-dependent interpretation   
decimal separators    
date formats    
thousands separators   
- no semantic interpretation of values   
- no tolerance or fuzzy matching   
comparison basis     
- exact value equality    
- byte-for-byte equivalence after computation and representation rules are applied 
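
illustrative sketch (python) of the comparison basis
shows that raw 32 byte outputs compare exactly, while displayed hexadecimal strings compare equal only when hex character case is consistent
assumptions: the input bytes "abc" and the variable names are hypothetical sample choices

```python
import hashlib

raw_static = hashlib.sha256(b"abc").digest()
raw_live = hashlib.sha256(b"abc").digest()

# comparison on raw output values: exact byte-for-byte equality
raw_equal = raw_static == raw_live

# comparison on displayed values: the same bytes shown with different hex case
hex_lower = raw_static.hex()
hex_upper = raw_live.hex().upper()

display_equal_mixed_case = hex_lower == hex_upper
display_equal_same_case = hex_lower == hex_upper.lower()

print(raw_equal, display_equal_mixed_case, display_equal_same_case)
```

a mismatch caused only by hex character case is a representation difference, not a data change, which is why the case rule is stated explicitly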
 
6_constraints:
class: constraint
value:   expanded
constraint drift across implementations
static and live implementations may evolve independently
constraint adherence may diverge over time
mismatch may appear only at comparison time
 
shared-computation limitations
both sides can be “correct” in isolation
comparison validity requires identical rule application
 
risk statements (valid as constraints, not notes)
risk exists even with correct code
false confidence possible without verification
 
7_reproducibility:
class: reproducibility
value: lists considerations that impact comparability and reproducibility

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 03: sha-256 computation independent of execution method
id:  03.03___sha256_source___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: documents authoritative source of the sha-256 algorithm
 
2_source:
class: provenance
value: see section 5 for source of sha-256

3_dependencies:
class: dependency
value: not applicable, sha-256 source identification only

4_exclusions:
class: exclusion
value: not applicable, sha-256 source identification only
 
5_subject_matter:
class: substance
value:  expanded
source sha-256                        
national security agency (nsa) designed sha-2 (includes sha-256)                        
national institute of standards and technology (nist) standardized and published sha-256    
nist: publisher and maintainer of u.s. cryptographic standards (including fips)
nist publishes standards used by governments, companies, and the internet at large
current standard on sha-256 is fips 180-4 (published 2012, updated 2015); sha-256 was first standardized in fips 180-2 (2002)
fips = federal information processing standard                        
                       
fips 180-4                        
fips 180-4 defines the sha-2 family, including sha-256; for each algorithm it specifies:
input preprocessing
padding rules
bitwise operations
constants
output format
document maintained at:  https://csrc.nist.gov/pubs/fips/180-4/upd1/final                        
changes to fips 180-4                        
the document is actively maintained and updated by nist                        
sha-256 algorithm/computation does not change                        
the standard (containing documentation on sha-256 computation) can be updated                
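because the sha-256 computation itself does not change, an implementation can be checked against fixed, widely published digest values. the sketch below is an illustration only: it uses python's standard `hashlib` module, `self_check` is a hypothetical name, and the two inputs (the empty message and the ascii message "abc") are common known-answer test messages.

```python
import hashlib

# widely published sha-256 values for two short messages
KNOWN_ANSWERS = {
    b"": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    b"abc": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def self_check() -> bool:
    """Return True when the local sha-256 implementation reproduces every known answer."""
    return all(
        hashlib.sha256(message).hexdigest() == expected
        for message, expected in KNOWN_ANSWERS.items()
    )


print(self_check())  # True
```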
common uses                        
data integrity verification (file checksums)                        
password hashing (often combined with salt and key stretching)                        
digital signatures                        
blockchain & cryptocurrencies (e.g., Bitcoin mining)                        
secure authentication systems            

6_constraints:
class: constraint
value:   nist may change fips 180-4 standard language (algorithm computation remains stable)
 
7_reproducibility:
class: reproducibility
value: reproducibility not applicable, purpose is source identification

8_notes:
class: note
value: nist may change fips 180-4 website location at any time
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id:  04.01___vba_id___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document visual basic for applications (vba) as an optional execution method used to compute sha-256 within c02
 
2_source:
class: provenance
value: vba is embedded within microsoft excel and executes in the excel application context (not a standalone runtime)

3_dependencies:
class: dependency
value: vba remains accessible through excel

4_exclusions:
class: exclusion
value:  wp is documenting vba as a possible computing mechanism within c02 only; execution steps and the specific vba code used to compute sha-256 are out of scope
 
5_subject_matter:
class: substance
value:  expanded
vba subject matter documented                    
vba is a programming language available within excel                    
sha-256 computed via vba or system libraries                    
external library invocation permitted                    
vba-produced sha-256 value may be generated as a live output and then stored as the static baseline by copying values                    
 
6_constraints:
class: constraint
value:   not applicable, identification of execution method only
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id:  04.02___vba_container___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document excel as the container for visual basic for applications (vba)
 
2_source:
class: provenance
value: microsoft office excel (vendor-provided software)

3_dependencies:
class: dependency
value: excel continues to offer vba capability

4_exclusions:
class: exclusion
value:  no vba execution logic, performance characteristics, or output validation documented
 
5_subject_matter:
class: substance
value:  excel as container for vba execution (non-standalone runtime)                
 
6_constraints:
class: constraint
value:  vba cannot be executed independent of excel
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
id:  04.03___vba_environment_requirements___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document environment setup to access vba for sha-256 computation
 
2_source:
class: provenance
value: this setup reflects what builder actually did

3_dependencies:
class: dependency
value: excel continues to offer vba capability; user has admin rights to make changes in OS

4_exclusions:
class: exclusion
value:  expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
 
5_subject_matter:
class: substance
value: expanded
platform assumptions
operating system: windows
host application: microsoft excel
execution context: in-process (not standalone)

capability requirements
vba enabled
macro execution permitted
ability to load xlam / workbook modules
access to windows cng via bcrypt.dll

encoding invariant (important)
vba strings are utf-16le
this encoding is not configurable
sha-256 input bytes are derived from this encoding
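to make the byte-level meaning of the encoding invariant concrete, the sketch below prints the bytes that different encodings produce for the same two-character text. python is used for illustration only (it is not the vba code), and the sample text is an assumption.

```python
text = "A\u00e9"  # "A" followed by "e with acute accent"

utf16le = text.encode("utf-16-le")  # 2 bytes per character here, low byte first, no byte order mark
utf16 = text.encode("utf-16")       # python's generic codec prepends a byte order mark
utf8 = text.encode("utf-8")

print(utf16le.hex(" "))  # 41 00 e9 00
print(utf8.hex(" "))     # 41 c3 a9
print(len(utf16) - len(utf16le))  # 2: the extra bytes are the byte order mark
```

any method compared against vba output would need to hash the first byte sequence, without a byte order mark.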
 
6_constraints:
class: constraint
value: vba cannot be executed independent of excel; excel and vba implicitly encode in utf-16le
 
7_reproducibility:
class: reproducibility
value: reproducible on windows + excel if steps followed; execution outputs validated at dataset level

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 04: vba as sha-256 execution method (runs within excel)
OPEN id: 04.04___vba_script___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document vba script used in sha256 computation
 
2_source:
class: provenance
value: gpt authored script

3_dependencies:
class: dependency
value: excel continues to offer vba capability

4_exclusions:
class: exclusion
value:  expanded
no executable logic
no performance characteristics
no output validation
 
5_subject_matter:
class: substance
value: OPEN – WILL BE UPDATED WHEN FINALIZED
 
6_constraints:
class: constraint
value: vba cannot be executed independent of excel; excel and vba implicitly encode in utf-16le, so the script must likewise compute sha256 over utf-16le bytes
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: this wp is NOT complete, need to test new code live when using control in reality
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)  
id:  05.01___python_id___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document python as an optional execution method used to compute sha-256 within c02
 
2_source:
class: provenance
value: python is an open-source programming language governed and maintained by the python software foundation (psf) and the global python community

3_dependencies:
class: dependency
value: python runtime installed and accessible on execution host

4_exclusions:
class: exclusion
value:  wp is documenting python as a possible computing mechanism within c02 only; execution steps and dataset-specific python scripts are out of scope
 
5_subject_matter:
class: substance
value: expanded
python subject matter documented                    
python is a general-purpose programming language executed as a standalone runtime    
python is primarily used for batch, file-based computation and data transformation external to excel                    
python scripts vary by dataset to accommodate input structure and preprocessing; the sha-256 computation logic is invariant across datasets                    
python scripts read input data from files or defined input streams to produce sha-256 output
builder uses python as an external sha256 computation method                    
sha-256 is computed using python standard libraries or system-provided cryptographic libraries
python-produced sha-256 values are generated as stored static values            
python outputs are persisted as file-based artifacts, not interactive cell outputs            
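a minimal sketch of the file-in, file-out pattern described above. it is not the builder's script: `hash_text_file` and `persist_digest` are hypothetical names, the file names are placeholders, the input is assumed to be utf-8 text, and the text is hashed as utf-16le on the assumption that outputs must be comparable with excel/vba.

```python
import hashlib
import tempfile
from pathlib import Path


def hash_text_file(source: Path, source_encoding: str = "utf-8") -> str:
    """Decode the file to text, then hash the utf-16le bytes of that text."""
    text = source.read_text(encoding=source_encoding)
    return hashlib.sha256(text.encode("utf-16-le")).hexdigest()


def persist_digest(digest_hex: str, target: Path) -> None:
    """Store the digest as a static, file-based artifact (one lowercase hex line)."""
    target.write_text(digest_hex + "\n", encoding="ascii")


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "dataset.txt"         # hypothetical input file
    target = Path(tmp) / "dataset.sha256.txt"  # hypothetical output artifact
    source.write_text("abc", encoding="utf-8")
    persist_digest(hash_text_file(source), target)
    stored = target.read_text(encoding="ascii").strip()

print(stored == hashlib.sha256("abc".encode("utf-16-le")).hexdigest())  # True
```

dataset-specific preprocessing would sit in front of `hash_text_file`; the hashing step itself stays the same.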
 
6_constraints:
class: constraint
value: constraints arise from execution environment and integration targets (e.g., file formats consumed by excel)
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
id:  05.02___python_container___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document that python has no inherent container; execution context is runtime + OS and outputs are persisted to files
 
2_source:
class: provenance
value: the python software foundation (psf)

3_dependencies:
class: dependency
value: python execution is external to excel; excel functions only as a downstream analysis and control surface; python continues to be available and supported by the python software foundation (psf)

4_exclusions:
class: exclusion
value:  no python execution logic, performance characteristics, or output validation documented
 
5_subject_matter:
class: substance
value: expanded
python has no inherent container; it runs as a standalone process and its outputs are published to external files
python is not exclusive to any computing device (pc or mac)
python is not exclusive to windows or any particular OS
 
6_constraints:
class: constraint
value: python is a standalone runtime, but not an application container or interface by default
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
id:  05.03___python_environment_requirements___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document environment setup to access python for sha-256 computation
 
2_source:
class: provenance
value: this setup reflects what builder actually did

3_dependencies:
class: dependency
value: python software foundation (psf) continues to support python scripting language; user has admin rights to make changes in OS

4_exclusions:
class: exclusion
value:  expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
 
5_subject_matter:
class: substance
value: expanded
platform assumptions
operating system: os-agnostic (windows, macos, linux)
host application: none (standalone runtime)
execution context: external process (not in-process with excel)

capability requirements
python runtime installed (single installation; version documented once)
runtime reused across datasets unless explicitly changed
access to standard or approved cryptographic libraries

encoding decision (control-critical)
python scripts must explicitly define text encoding
sha-256 input bytes are derived from script-defined encoding (e.g., utf-8, utf-16le)
utf-16le is selected to align with excel/vba implicit encoding
encoding choice is a control decision and must remain consistent across datasets when comparability is required

python installation
python runtime version is fixed at installation
python does not auto-update via os updates
runtime version changes require explicit reinstallation and revalidation

version installed (informational only): python 3.13 (64-bit), identified by searching for app

approximate date of installation (informational only): august 14, 2025 (per python file properties)
 
6_constraints:
class: constraint
value: expanded
python is a standalone runtime with no inherent application container or user interface

python execution occurs external to excel

encoding alignment is control-critical:
- excel/vba implicitly encode strings as utf-16le
- python scripts MUST explicitly encode input text as utf-16le to produce comparable sha-256 outputs
- failure to align encoding will result in non-matching hashes for identical visible text
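the effect of misaligned encoding can be sketched as below. `sha256_text` is a hypothetical helper whose `encoding` argument is keyword-only with no default, so the encoding must always be stated explicitly; the sample text is an assumption.

```python
import hashlib


def sha256_text(text: str, *, encoding: str) -> str:
    """Hash text under an encoding the caller must name explicitly (no default)."""
    return hashlib.sha256(text.encode(encoding)).hexdigest()


visible_text = "total"
aligned = sha256_text(visible_text, encoding="utf-16-le")  # same byte layout as excel/vba strings
misaligned = sha256_text(visible_text, encoding="utf-8")   # same visible text, different bytes

print(aligned == misaligned)  # False
```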

runtime constraints
python runtime version is fixed at installation
runtime does not auto-update via operating system updates
any runtime upgrade requires explicit reinstallation and revalidation
 
7_reproducibility:
class: reproducibility
value: expanded
reproducible provided:
identical python runtime major/minor version
identical script logic
identical explicit text encoding (utf-16le)
identical input data
python runtime updates do not occur automatically
reproducibility across time requires either:
preserving the original runtime version, or
revalidating outputs after a runtime upgrade
execution outputs are validated at the dataset level
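one way to keep the conditions above verifiable over time is to store them next to the digest. the sketch below is an assumption about how that could look, not a documented part of the control: `reproducibility_record` and its field names are hypothetical.

```python
import hashlib
import json
import sys


def reproducibility_record(text: str, encoding: str = "utf-16-le") -> dict:
    """Bundle the digest with the conditions it was produced under."""
    return {
        "sha256": hashlib.sha256(text.encode(encoding)).hexdigest(),
        "encoding": encoding,
        "python_major_minor": f"{sys.version_info.major}.{sys.version_info.minor}",
    }


record = reproducibility_record("abc")
print(json.dumps(record, indent=2))
```

a later run can compare its own runtime version and encoding against the stored record before comparing digests, which shows whether a mismatch could stem from the environment rather than the data.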

8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control 02: data integrity comparative computation to detect change(s) in data
section 05: python as sha-256 execution method (runs external to excel)
OPEN id:  05.04___python_script___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
 
1_overview:
class: context
value: document python script used in sha256 computation
 
2_source:
class: provenance
value: gpt authored script

3_dependencies:
class: dependency
value: python software foundation (psf) continues to support python scripting language; user has admin rights to make changes in OS

4_exclusions:
class: exclusion
value:  expanded
what this WP explicitly does not cover
execution logic or steps
dataset-specific setup
file locations
output validation
performance characteristics
 
5_subject_matter:
class: substance
value: expanded
INSERT CODE HERE

6_constraints:
class: constraint
value: 
 
7_reproducibility:
class: reproducibility
value: not applicable to this workpaper; execution reproducibility addressed at dataset level workpapers

8_notes:
class: note
value: this wp is NOT complete, need to test new code live when using control in reality
END WORKPAPER

bottom of page