
02_control_data_integrity

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 01_workpapers___list_of_workpapers_in_c02
workpaper id:  01___workpaper_summary___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

​ 
1_overview:
class: context
value: workpaper summary for control 02: data integrity control
 
2_source:
class: provenance
value: builder authored
 
3_dependencies:
class: dependency
value: not applicable, this is a workpaper summary
 
4_exclusions:
class: exclusion
value: not applicable, this is a workpaper summary
 
5_subject_matter:
class: substance
value: expanded section
workpapers (machine readable + human verifiable)
01.01___workpaper_summary___d    
    
02.01___control_overview___d
02.02___control_container_excel___d           
02.03___control_example_excel___d 
​
03.01___sha256_concepts___d
03.02___sha256_considerations___d
03.03___sha256_source___d
​
04.01___vba_id___d
04.02___vba_container___d
04.03___vba_environment_requirements___d
04.04___vba_script___d
​
05.01___python_id___d
05.02___python_container___d
05.03___python_environment_requirements___d
05.04___python_script___d
 
appendixes (human aids, machine ignore) 
a.01___vba_set_up_notes___a
a.02___python_background_notes___a
​
flowcharts (human aids, machine ignore)
f.01___control_overview___fc
f.02___data_transformation_simple__fc
f.03___data_transformation_expanded__fc
f.04___data_reversibility_boundary-map___fc
f.05___vba_flow___fc
​
6_constraints:
class: constraint
value: not applicable, control policy documentation only
​
7_reproducibility:
class: reproducibility
value: not applicable, controls subjectively implemented by builder
 
8_notes:
class: note
value: none
END WORKPAPER
​
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id:  02.01___control_overview___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

​ 
1_overview:
class: context
value: overview of control 02: data integrity control
 
2_source:
class: provenance
value: builder designed control (no external standard asserted)
 
3_dependencies:
class: dependency
value: not applicable, this is a control overview
 
4_exclusions:
class: exclusion
value: expanded section
control design subjectivity
control creation is subjectively designed by builder - another builder may choose a different control
      
correctness of source data
this control does not validate correctness of source data  
errors introduced prior to static computation generation are outside scope 
      
intentional manipulation
the control does not detect intentional data manipulation occurring prior to static computation generation
      
other human error scope
the control does not prevent or correct human error outside the specified control process
      
semantic meaning
this control does not detect semantic equivalence  
semantic equivalence = meaning of control data  
 
5_subject_matter:
class: substance
value: expanded section
assurances
if the control passes, static and live computation results are equal at the evaluated scope
this provides assurance that no detectable change has occurred to the selected data since the static computation was generated
this assurance applies only to the exact data selection, encoding, and boundaries used
this assurance does not extend beyond the evaluated scope
this assurance does not assert correctness, completeness, or intent
      
cell error/non-value handling
if a cell returns an error or non-value (#N/A, #VALUE!, #DIV/0!, etc.), the return is addressed at the occurrence level
no resolution occurs within this control workpaper  
resolution, if any, is done at the dataset level  
evaluation and resolution are out of scope   
      
change detection logic
change is detected when static and live computation results are not equal 
detection is binary (change = 1 / no change = 0)  
a value of 1 indicates the presence of change only; it does not quantify the number or magnitude of changes
no conclusion, diagnosis, or cause of change is interpreted within this logic 
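
illustrative sketch (not part of the control): the binary logic above expressed in python. the function names, the sample values, and the "|" delimiter are assumptions made for illustration; only the standard hashlib module is used.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def change_indicator(static_hex: str, live_hex: str) -> int:
    """Return 0 when the two results are equal, 1 otherwise."""
    return 0 if static_hex == live_hex else 1


# static result: generated once from the original selection and stored
static_hex = sha256_hex("100|200|300".encode("utf-8"))

# live results: recomputed from the current selection
unchanged = sha256_hex("100|200|300".encode("utf-8"))
one_edit = sha256_hex("100|200|301".encode("utf-8"))
many_edits = sha256_hex("999|888|777".encode("utf-8"))

print(change_indicator(static_hex, unchanged))   # 0
print(change_indicator(static_hex, one_edit))    # 1
print(change_indicator(static_hex, many_edits))  # 1
```

one edit and many edits both return 1, which is why the indicator signals presence of change only and not its number or magnitude.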
      
change resolution
detected changes are addressed at the occurrence level  
no resolution occurs within this control   
evaluation and resolution are out of scope   
      
defined
control compares a static computation to a live computation for the purpose of detecting a change (if any)
changes are detected by manually reviewing the change-indicator calculation cell(s), where:
static computation result = live computation result → 0  
static computation result ≠ live computation result → 1  
results are reviewed manually by the user   
data selection defined by user and varies   
computation is SHA-256    
      
execution
static computation is generated once and stored; live computation is continuously derived from current data
equality comparison is executed by formula; interpretation is performed by the user
detected differences and confirmations of no change are subject to manual review 
the control operates continuously and is not limited to a point-in-time execution 
      
implementation safeguards (optional)
optional safeguards may be applied at implementation time  
examples (non-exhaustive):    
- freezing control columns    
- protection against accidental formula overwrite  
- separate storage of static computation   
- table-based data selection    
safeguard selection and implementation may vary by:  
- database     
- dataset     
- execution environment (excel, vba, python)   
absence of safeguards does not affect:   
- change detection logic    
- computation correctness    
- sha-256 determinism      
      
input basis
control computes SHA-256 over raw cell values, not displayed or formatted values 
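
illustrative sketch (not part of the control): why the input basis matters. the way a raw cell value is turned into text before encoding is an assumption here (python repr and UTF-8); the actual serialization depends on the execution environment.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash the UTF-8 bytes of text (encoding choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


raw_value = 0.5          # value stored in the cell
displayed_value = "50%"  # what a percentage format would show

raw_hash = sha256_hex(repr(raw_value))        # repr(0.5) == "0.5"
displayed_hash = sha256_hex(displayed_value)

# raw and displayed forms are different byte inputs
print(raw_hash == displayed_hash)  # False

# changing only the display format leaves the raw value, and its hash, unchanged
print(sha256_hex(repr(raw_value)) == raw_hash)  # True
```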
      
scope
the control applies to any user-defined data selection, including but not limited to:
single symbol(s), cell(s), column(s), row(s), table(s), or file-level outputs 
control tests output equivalence only   
the control operates on the selected data as a whole, not on semantic meaning 
scope is defined at execution time by the user or system configuration 
 
6_constraints:
class: constraint
value: expanded section
algorithm dependency
this control relies on deterministic computation, not on a specific algorithm 
the computation algorithm used by the control may change over time 
algorithm replacement does not alter the control logic or comparison method 
      
excel related dependencies
correct execution of the computation function  
integrity of the spreadsheet environment   
      
nature of manual review control
this control requires manual review   
manual processes are by nature subject to risk  
      
unintentional manipulation
the control is designed to detect accidental and unintended changes 
the control may surface intentional changes incidentally  
 
7_reproducibility:
class: reproducibility
value: not applicable, this is a control overview
​
8_notes:
class: note
value: none
END WORKPAPER
​
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id:  02.02___control_container___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD

​ 
1_overview:
class: context
value: documents microsoft excel as the execution, storage, and review container for control 
 
2_source:
class: provenance
value: microsoft office (commercial software product)
​
3_dependencies:
class: dependency
value: not applicable, workpaper is to document container not control itself

4_exclusions:
class: exclusion
value: not applicable, workpaper is to document container not control itself
 
5_subject_matter:
class: substance
value: expanded section
configuration considerations
excel behavior may vary based on:     
excel version      
regional settings      
calculation mode (automatic vs manual)    
workbook protection settings     
these factors may affect reproducibility and must be controlled or acknowledged by user  
       
container role
microsoft excel serves as the execution, storage, and review container for the di01 control  
excel provides:      
a structured grid for data selection     
formula-based computation and comparison    
persistent storage of static computation results    
visual inspection of change-indicator outputs    
excel does not define the control logic or computation mechanism   
       
execution context
the DI01 control executes within excel via worksheet formulas    
execution characteristics:      
static computation results are stored as fixed values in cells   
live computation results are continuously derived from current cell values   
equality comparison is executed by formula    
change-indicator output is visible immediately upon data modification   
no batch job or scheduled execution is required    
       
storage behavior
excel stores:      
source data values      
static computation results     
live computation formulas     
change-indicator formulas     
excel does not enforce immutability of stored values    
persistence relies on standard workbook save behavior      
       
user interaction and review
the user:      
defines data selection boundaries     
initiates static computation generation     
observes change-indicator outputs     
performs manual review when change is detected    
excel provides visibility but does not perform interpretation or resolution   

6_constraints:
class: constraint
value: not applicable, workpaper is to document container not control itself
 
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is to document container not control itself
​
8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id:  02.03___control_example_excel___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD

​ 
1_overview:
class: context
value: provide an illustrative example of control 02
 
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, workpaper is illustrative only
​
4_exclusions:
class: exclusion
value: not applicable, workpaper is illustrative only
 
5_subject_matter:
class: substance
value:  expanded
example control
illustrates the control at a conceptual level only
example demonstrates how control 02 detects change, not how sha256 is computed
formulas in "row 1" are descriptive not live excel formulas
 
control output rule:
• Output 0 when static and live sha256 values are identical
• Output 1 when static and live sha256 values differ
 
control illustrative example
illustrative data = hdkslHSKD
static sha-256 = stored sha-256 computation result
live sha-256 = currently computed sha-256 computation result
 
computation formula=sha256_text(hdkslHSKD)
static sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
live sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
control formula = IF(static=live,0,1)
[static] = [live] → 0
[static] ≠ [live] → 1
control result = 0 (the two sha-256 computations agree)
 
notes:
stored result execution method varies - i.e. vba in excel, python
there are multiple excel formulas for calculation of sha256 computation
another formula could be used to identify differences between the static and live sha256 computations
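
illustrative sketch (not part of the control): the example above reproduced in python. sha256_text here is a hypothetical stand-in for the descriptive formula name used in this workpaper, and UTF-8 is an assumed encoding; the digest this sketch produces is not asserted to match the value listed above.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Hypothetical stand-in for the descriptive sha256_text formula (UTF-8 assumed)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


illustrative_data = "hdkslHSKD"

static_sha256 = sha256_text(illustrative_data)  # computed once, then stored as a fixed value
live_sha256 = sha256_text(illustrative_data)    # recomputed from the current data

control_result = 0 if static_sha256 == live_sha256 else 1
print(control_result)  # 0

# editing the data (a trailing space here) changes the live value only
live_sha256 = sha256_text(illustrative_data + " ")
control_result = 0 if static_sha256 == live_sha256 else 1
print(control_result)  # 1
```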
 
6_constraints:
class: constraint
value: not applicable, workpaper is illustrative only  


7_reproducibility:
class: reproducibility
value:not applicable, workpaper is illustrative only
​
8_notes:
class: note
value: none
END WORKPAPER


 

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id:  03.01___sha256_concepts___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
​ 
1_overview:
class: context
value: describe concepts related to sha-256 computation
 
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope
​
4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
 
5_subject_matter:
class: substance
value:  expanded section

concepts grouped functionally, not alphabetically


5.1 concept notes: not concept specific

descriptions are intentionally brief    

descriptions grouped by function not alphabetical   

descriptions are functional, not dictionary-based   

concepts included were subjectively chosen by builder with GPT assistance  

       

5.2 primitives: smallest units a system operates on directly

primitive      

the smallest unit a system treats as indivisible for its intended purpose within that system’s abstraction boundary

may have meaning, but is not decomposed further within the system  

combined to build more complex structures, but are not broken down further within the system 

       

5.3 display primitives: what is converted to computer primitives

symbol      

a character (e.g., A, 9, space, :, emoji)

some are human readable, and some are not (whitespace, non-printing characters, etc.) 

       

symbols (sha-256)     

not processed directly by sha-256    

are interpreted only after encoding converts them into bytes  

       

data scope      

defined by explicit selection and boundary prior to computation by the end user  

can be a single character, a word, a row, a table, a file, a folder, etc.

can include invisible characters (spaces, line breaks, formatting characters)

       

5.4 computer primitives: what computers see/interpret

       

bit      

single binary value (0 or 1) — the smallest unit of data representation  

the smallest unit computers use to represent information  

       

byte      

a group of bits     

how computers group bits meaningfully   

one byte (grouping of 8) has 256 possible patterns (arrangements of 0s and 1s)  

       

byte pattern      

the ordered sequence of bits (0s and 1s) within a byte

computers represent information by the arrangement of bits within bytes
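
illustrative sketch (not part of the control): the bit and byte relationship above shown in python, using the symbol "A" under UTF-8 as an assumed example.

```python
# a byte is 8 bits, so it has 2**8 possible patterns
patterns_per_byte = 2 ** 8
print(patterns_per_byte)  # 256

# byte pattern for the symbol "A" under UTF-8
byte_value = "A".encode("utf-8")[0]
bit_pattern = format(byte_value, "08b")
print(byte_value, bit_pattern)  # 65 01000001
```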

       

5.5 encoding: process of translating human readable symbols to computer readable bytes

encoding      

can be likened to translating - from symbol to bytes (groups of bits - 0s and 1s)  

encoding inputs: symbols identified by character set (Unicode code points or equivalent) prior to encoding 

the mapping of symbols to byte patterns   

       

encoding systems     

stable schemas or frameworks that map symbol(s) to bytes (patterns of bits)  

the same symbol may be represented by different byte patterns under different schemas

byte patterns must be stable within an encoding framework for results to be deterministic and comparable 

       

code unit      

the fixed-size unit of storage used by an encoding scheme to represent code points 

not a character or a symbol    

it is an intermediate representation between symbols and raw bytes  

code units are defined by the encoding scheme, not by sha-256

UTF-8 uses 8-bit code units; UTF-16 uses 16-bit code units  

one code point may map to one or more code units depending on the encoding  
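
illustrative sketch (not part of the control): counting code units per code point in python. the helper names and the three sample symbols are assumptions chosen for illustration.

```python
def utf8_code_units(symbol: str) -> int:
    """Number of 8-bit code units (bytes) UTF-8 uses for the symbol."""
    return len(symbol.encode("utf-8"))


def utf16_code_units(symbol: str) -> int:
    """Number of 16-bit code units UTF-16 uses for the symbol."""
    return len(symbol.encode("utf-16-le")) // 2


# sample symbols: A, e with acute accent, grinning face emoji
for symbol in ["A", "\u00e9", "\U0001F600"]:
    print(f"U+{ord(symbol):04X}", utf8_code_units(symbol), utf16_code_units(symbol))
# U+0041 1 1
# U+00E9 2 1
# U+1F600 4 2
```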

       

UTF-8 (encoding)     

a type of byte mapping system where symbols (Unicode code points) are mapped to one to four 8-bit code units (8 bits = 1 byte)

symbol-to-byte mapping is stable across all UTF-8 systems  

a given code point always maps to the same byte sequence under UTF-8  

the mapping is deterministic (identical inputs = identical byte patterns)   

the mapping is reversible (byte pattern to symbol and symbol to byte pattern)  

       

UTF-16 (encoding)     

a type of byte mapping system where symbols (Unicode code points) are mapped to one or two 16-bit code units (16 bits = 2 bytes); code points above U+FFFF use two code units (a surrogate pair)

symbol-to-code-unit mapping is stable across all UTF-16 systems; the byte order of each code unit (little-endian or big-endian) must also be fixed for byte patterns to match

a given code point always maps to the same code unit sequence under UTF-16

the mapping is deterministic (identical inputs = identical byte patterns)

the mapping is reversible (byte pattern to symbol and symbol to byte pattern)  
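
illustrative sketch (not part of the control): reversibility and encoding-specific byte patterns in python. the sample text is an assumption; note that UTF-16 byte order (little-endian or big-endian) is an additional choice that changes the byte pattern.

```python
text = "A\u00e9"  # A followed by e with acute accent

# reversibility: encode then decode returns the original symbols
assert text.encode("utf-8").decode("utf-8") == text
assert text.encode("utf-16-le").decode("utf-16-le") == text

# same symbols, different byte patterns per encoding
utf8_bytes = text.encode("utf-8")
utf16_le_bytes = text.encode("utf-16-le")
utf16_be_bytes = text.encode("utf-16-be")
print(utf8_bytes.hex())      # 41c3a9
print(utf16_le_bytes.hex())  # 4100e900
print(utf16_be_bytes.hex())  # 004100e9
```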

       

5.6 sha-256 algorithm input concepts

input      

symbols delivered to the sha-256 algorithm must first be converted to bytes

sha-256 does not know which encoding system was used; it recognizes byte patterns only

sha-256 cannot see symbols, words, or meaning—only byte sequences  

boundaries for byte input(s) are defined by user (look at word, row, columns a + b, entire table, etc.) 

       

input and encoding     

different encodings ⇒ different byte arrays ⇒ different sha-256 output  

Same encoding + same symbols ⇒ same bytes ⇒ same sha-256 output   

assumes same byte order and concatenation order   

possibility of collisions noted and deemed out of scope   
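
illustrative sketch (not part of the control): the encoding relationships above checked in python. the sample symbols and the two encodings compared are assumptions for illustration.

```python
import hashlib

symbols = "data"

utf8_hash = hashlib.sha256(symbols.encode("utf-8")).hexdigest()
utf16_hash = hashlib.sha256(symbols.encode("utf-16-le")).hexdigest()
utf8_again = hashlib.sha256(symbols.encode("utf-8")).hexdigest()

print(utf8_hash == utf16_hash)  # False: different encodings, different bytes
print(utf8_hash == utf8_again)  # True: same encoding, same symbols, same bytes
```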

       

byte stream construction    

the order in which bytes are concatenated is significant  

the same bytes in a different order produce a different result  

e.g., concatenating columns A+B vs B+A produces different byte streams and different hashes 
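
illustrative sketch (not part of the control): concatenation order in python. the column contents are hypothetical sample values.

```python
import hashlib

column_a = "alpha".encode("utf-8")
column_b = "beta".encode("utf-8")

hash_a_then_b = hashlib.sha256(column_a + column_b).hexdigest()
hash_b_then_a = hashlib.sha256(column_b + column_a).hexdigest()

# the two streams contain the same bytes, only in a different order
print(sorted(column_a + column_b) == sorted(column_b + column_a))  # True
print(hash_a_then_b == hash_b_then_a)                              # False
```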

       

5.7 sha-256 algorithm and computation concepts

algorithm      

a fixed set of rules for transforming input into output   

the rules do not change based on content   

only the input affects the result    

       

deterministic      

same inputs (same bytes) produce the same result   

change in input bytes produces a different result except for cryptographic collisions 

different encoding systems produce different results because they produce different byte patterns for the same symbol(s) 

       

sha-256 algorithm     

A fixed, deterministic algorithm that converts input bytes into a 32-byte output  

SHA = Secure Hash Algorithm    

256 bits fixed output (32 bytes × 8 bits per byte)   

if algorithm changes, it is no longer sha-256   

       

sha-256 computation     

execution of the sha-256 algorithm    

       

sha-256 determinism     

assumes a fixed encoding and consistent byte stream construction  

sha-256 stability

the sha-256 specification text has received clarifications, but the algorithm definition has not changed

any change would create a different algorithm, not sha-256  

this stability is critical for compatibility, verification, and security  

       

sha-256 reversibility

raw output is non-reversible due to the many-to-one mapping from arbitrary-length inputs (effectively infinite possibilities) to a finite, fixed-length output space 

the non-reversibility of the output makes the algorithm one way  

       

5.8 sha-256 computation other factors (salting)

salting (explicitly excluded)      

salt = extra bytes deliberately added to the input before sha-256 computation  

no additional bytes are appended to input prior to sha-256 computation  

extra data (salt) would cause identical inputs to produce different outputs when salt differs 

       

salt is used to:     

prevent precomputed (rainbow table) attacks   

ensure identical passwords do not hash to the same value across different records 

increase resistance to precomputation attacks   

salt is good for passwords    

salt is bad for deterministic comparison controls and will not be used  
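
illustrative sketch (not part of the control): why salting breaks deterministic comparison. the salt values are hypothetical fixed bytes chosen so the sketch is repeatable, and placing the salt before the data is an assumption; real salts are normally random.

```python
import hashlib

data = "same input".encode("utf-8")

# no salt: identical inputs always give identical outputs
unsalted_1 = hashlib.sha256(data).hexdigest()
unsalted_2 = hashlib.sha256(data).hexdigest()
print(unsalted_1 == unsalted_2)  # True

# different salts: identical inputs give different outputs
salt_1 = b"hypothetical-salt-one"
salt_2 = b"hypothetical-salt-two"
salted_1 = hashlib.sha256(salt_1 + data).hexdigest()
salted_2 = hashlib.sha256(salt_2 + data).hexdigest()
print(salted_1 == salted_2)  # False
```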

       

5.9  sha-256 computation other factors (keying)

keying (explicitly excluded)    

key = secret value combined with input during sha-256 computation  

key is applied deliberately before or during computation  

keyed sha-256 computation produces different outputs for identical inputs when keys differ 

       

how keying is used:     

authentication and message integrity (e.g., HMAC)   

ensures only parties with the key can reproduce or verify the hash  

used to prove authenticity, not just detect change   

       

why keying exists:     

prevents unauthorized hash forgery    

binds the hash result to possession of a secret   

protects against tampering in adversarial environments   

       

why keying is excluded here:    

deterministic comparison requires identical inputs → identical outputs  

secret keys would prevent independent precomputation   

control is not performing authentication or trust enforcement  

key management is out of scope    
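
illustrative sketch (not part of the control): keyed versus unkeyed computation, using HMAC-SHA-256 from the python standard library as the keyed example. the key values are hypothetical.

```python
import hashlib
import hmac

message = "same input".encode("utf-8")
key_1 = b"hypothetical-key-one"
key_2 = b"hypothetical-key-two"

# keyed: output depends on the key as well as the input
keyed_1 = hmac.new(key_1, message, hashlib.sha256).hexdigest()
keyed_2 = hmac.new(key_2, message, hashlib.sha256).hexdigest()
print(keyed_1 == keyed_2)  # False

# unkeyed (the control position): anyone with the same bytes gets the same output
unkeyed = hashlib.sha256(message).hexdigest()
print(unkeyed == hashlib.sha256(message).hexdigest())  # True
```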

       

5.10 control position - keying and/or salting   

no salt added     

no secret keys used     

sha-256 executed without key material   

output comparability preserved across environments   

       

5.11 sha-256 algorithm output concepts (machine readable)

output general

sha-256 takes bytes as input and produces bytes as output

different encoding systems produce different byte patterns  

different byte input produces different byte output (except for cryptographic collisions)

       

32 byte output

the raw result produced by the sha-256 algorithm

this output is always exactly 32 bytes (fixed output)   

32 bytes x 8 bits per byte = 256 bits as in sha-256   

raw output would be a grouping of 256 0s and 1s    

below is a made-up example of one 32 byte output - 32 sets of 8 bits (0s and 1s)

01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111 

00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101 

01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111 

00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101 
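
illustrative sketch (not part of the control): a real 32 byte output in python, using the input bytes "abc" as an assumed example, printed as groups of 8 bits.

```python
import hashlib

raw_output = hashlib.sha256(b"abc").digest()

print(len(raw_output))      # 32 (bytes)
print(len(raw_output) * 8)  # 256 (bits)

# each byte rendered as a group of 8 bits
bit_groups = [format(byte, "08b") for byte in raw_output]
print(" ".join(bit_groups[:4]))  # 10111010 01111000 00010110 10111111
```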

       

5.12 representation change from 32 byte output to human readable (not done by sha-256)

problem with 32 byte output    

raw output difficult for human to read and store   

       

hexadecimal

number system with a base of 16

decimal -- base 10 -- digits 0-9

hex -- base 16 -- digits 0-9 plus 6 letters a-f (a-f = 10-15)

       

64 character hexadecimal string    

a standard representation of the sha-256 output (after computation)  

the 32-byte output is converted into a 64-character hexadecimal string  

the conversion methodology is stable - same 32 byte patterns produce same 64 character hexadecimal strings

more human readable and easier to store   

each byte is represented by a combination of 2 characters  

32 bytes x 2 characters per representation = 64 characters   

the characters are referred to as hexadecimal because they are 0-9 and a-f specifically 

the hexadecimal representation is reversible - can be converted back into bytes  

fingerprint (raw)

fixed 32 byte sha-256 output

fingerprint (display)

the 32 byte output converted and displayed as a 64 character hexadecimal string
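
illustrative sketch (not part of the control): converting the raw fingerprint to its display form and back in python, using the input bytes "abc" as an assumed example.

```python
import hashlib

raw_output = hashlib.sha256(b"abc").digest()  # fingerprint (raw), 32 bytes
hex_string = raw_output.hex()                 # fingerprint (display)

print(len(hex_string))  # 64 (2 characters per byte)
print(hex_string)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

# the representation is reversible: hex string back to the same 32 bytes
print(bytes.fromhex(hex_string) == raw_output)  # True
```

hashlib also offers hexdigest(), which returns the same 64 character string directly.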


6_constraints:
class: constraint
value:   expanded section
 

algorithm longevity

SHA-256 is widely supported and not considered broken 

future algorithms may supersede SHA-256 for some use cases 

this documentation reflects the current state of SHA-256 usage 

      

cryptographic collision risk

collision = two different byte inputs → same output  

a collision is mathematically possible but practically infeasible to find for sha-256

accepted as outside the scope   
 
7_reproducibility:
class: reproducibility
value: concepts based on builder (subjective), not intended for reproducibility
​
8_notes:
class: note
value: none
END WORKPAPER

BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id:  03.02___sha256_considerations___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD
​ 
1_overview:
class: context
value: describe considerations related to sha-256 computation
 
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope
​
4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
 
5_subject_matter:
class: substance
value:  expanded

concepts grouped functionally, not alphabetically
 
6_constraints:
class: constraint
value:   
 
7_reproducibility:
class: reproducibility
value:
​
8_notes:
class: note
value: 
END WORKPAPER



BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id:  03.01___sha256_concepts___d
first published: YYYY-MM-DD
last updated: YYYY-MM-DD

​ 
1_overview:
class: context
value: 
 
2_source:
class: provenance
value:
​
3_dependencies:
class: dependency
value:
​
4_exclusions:
class: exclusion
value:
 
5_subject_matter:
class: substance
value:  
 
6_constraints:
class: constraint
value:   
 
7_reproducibility:
class: reproducibility
value:
​
8_notes:
class: note
value: 
END WORKPAPER






























































































































 
