can a human inspect the text directly? i'm not sure.
02_control_data_integrity
​​​BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 01_workpapers___list_of_workpapers_in_c02
workpaper id: 01.01___workpaper_summary___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: workpaper summary for control 02: data integrity control
2_source:
class: provenance
value: builder authored
3_dependencies:
class: dependency
value: not applicable, this is a workpaper summary
4_exclusions:
class: exclusion
value: not applicable, this is a workpaper summary
5_subject_matter:
class: substance
value: expanded section
workpapers (machine readable + human verifiable)
01.01___workpaper_summary___d
02.01___control_overview___d
02.02___control_container_excel___d
02.03___control_example_excel___d
​
03.01___sha256_concepts___d
03.02___sha256_considerations___d
03.03___sha256_source___d
​
04.01___vba_id___d
04.02___vba_container___d
04.03___vba_environment_requirements___d
04.04___vba_script___d
​
05.01___python_id___d
05.02___python_container___d
05.03___python_environment_requirements___d
05.04___python_script___d
appendixes (human aids, machine ignore)
a.01___vba_set_up_notes___a
a.02___python_background_notes___a
​
flowcharts (human aids, machine ignore)
f.01___control_overview___fc
f.02___data_transformation_simple__fc
f.03___data_transformation_expanded__fc
f.04___data_reversibility_boundary-map___fc
f.05___vba_flow___fc
​
6_constraints:
class: constraint
value: not applicable, control policy documentation only
​
7_reproducibility:
class: reproducibility
value: not applicable, controls subjectively implemented by builder
8_notes:
class: note
value: none
END WORKPAPER
​
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.01___control_overview___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: overview of control 02: data integrity control
2_source:
class: provenance
value: builder designed control (no external standard asserted)
3_dependencies:
class: dependency
value: not applicable, this is a control overview
4_exclusions:
class: exclusion
value: expanded section
control design subjectivity
control creation is subjectively designed by builder - another builder may choose a different control
correctness of source data
this control does not validate correctness of source data
errors introduced prior to static computation generation are outside scope
intentional manipulation
the control does not detect intentional data manipulation occurring prior to static computation generation
other human error scope
the control does not prevent or correct human error outside the specified control process
semantic meaning
this control does not detect semantic equivalence
semantic equivalence = meaning of control data
5_subject_matter:
class: substance
value: expanded section
assurances
if the control passes, static and live computation results are equal at the evaluated scope
this provides assurance that no detectable change has occurred to the selected data since the static computation was generated
this assurance applies only to the exact data selection, encoding, and boundaries used
this assurance does not extend beyond the evaluated scope
this assurance does not assert correctness, completeness, or intent
cell error/non-value handling
if a cell is #N/A, #VALUE!, #DIV/0!, etc., the cell return is addressed at the occurrence level
no resolution occurs within this control workpaper
resolution, if any, is done at the dataset level
evaluation and resolution are out of scope
change detection logic
change is detected when static and live computation results are not equal
detection is binary (change = 1 / no change = 0)
a value of 1 indicates the presence of change only; it does not quantify the number or magnitude of changes
no conclusion, diagnosis, or cause of change is interpreted within this logic
change resolution
detected changes are addressed at the occurrence level
no resolution occurs within this control
evaluation and resolution are out of scope
defined
control compares a static computation to a live computation for the purpose of detecting a change (if any)
changes are detected by manually reviewing the change-indicator calculation cell(s), where:
static computation result = live computation result → 0
static computation result ≠ live computation result → 1
results are reviewed manually by the user
data selection defined by user and varies
computation is SHA-256
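illustrative sketch only (python; the control itself runs as worksheet formulas): the comparison defined above can be pictured as follows. the function names, the sample values, and the utf-8 encoding choice are assumptions made for illustration and are not part of the control.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 of text as a 64-character hex string (UTF-8 assumed)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def change_indicator(static_hex: str, live_hex: str) -> int:
    """Binary indicator: 0 when the two results are equal, 1 otherwise."""
    return 0 if static_hex == live_hex else 1


# static computation: generated once and stored (hypothetical sample value)
static_result = sha256_hex("invoice 1001")

# live computation: derived from whatever the data currently is
unchanged = change_indicator(static_result, sha256_hex("invoice 1001"))
changed = change_indicator(static_result, sha256_hex("invoice 1002"))

print(unchanged, changed)  # prints: 0 1
```

the indicator reports only that a difference exists; it carries no information about where or how large the difference is.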
execution
static computation is generated once and stored; live computation is continuously derived from current data
equality comparison is executed by formula; interpretation is performed by the user
detected differences and confirmations of no change are subject to manual review
the control operates continuously and is not limited to a point-in-time execution
implementation safeguards (optional)
optional safeguards may be applied at implementation time
examples (non-exhaustive):
- freezing control columns
- protection against accidental formula overwrite
- separate storage of static computation
- table-based data selection
safeguard selection and implementation may vary by:
- database
- dataset
- execution environment (excel, vba, python)
absence of safeguards does not affect:
- change detection logic
- computation correctness
- sha-256 determinism
input basis
control computes SHA-256 over raw cell values, not displayed or formatted values
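illustrative sketch only (python): why the input basis matters. the sample value, the display format, and the use of repr() to turn the raw value into text are all assumptions; how a given implementation serializes raw cell values is not specified here.

```python
import hashlib

raw_value = 1234.5                 # hypothetical raw cell value
displayed = f"{raw_value:,.2f}"    # what a formatted cell might show

# assumption: the raw value is turned into text with repr(), then UTF-8 encoded
raw_hash = hashlib.sha256(repr(raw_value).encode("utf-8")).hexdigest()
display_hash = hashlib.sha256(displayed.encode("utf-8")).hexdigest()

print(repr(raw_value))             # 1234.5
print(displayed)                   # 1,234.50
print(raw_hash == display_hash)    # False
```

a formatting-only change alters the displayed text but not the raw value, so a control computed over raw values would not flag it.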
scope
the control applies to any user-defined data selection, including but not limited to:
single symbol(s), cell(s), column(s), row(s), table(s), or file-level outputs
control tests output equivalence only
the control operates on the selected data as a whole, not on semantic meaning
scope is defined at execution time by the user or system configuration
​6_constraints:
class: constraint
value: expanded section
algorithm dependency
this control relies on deterministic computation, not on a specific algorithm
the computation algorithm used by the control may change over time
algorithm replacement does not alter the control logic or comparison method
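illustrative sketch only (python): the comparison rule can be written so that the algorithm is a parameter. the function names are hypothetical; sha512 is used only as a stand-in for "some other deterministic algorithm" and is not proposed as a replacement.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Hash data with the named algorithm and return the hex string."""
    hasher = hashlib.new(algorithm)
    hasher.update(data)
    return hasher.hexdigest()


def change_indicator(stored: bytes, current: bytes, algorithm: str) -> int:
    """Same comparison rule regardless of algorithm: 0 = equal, 1 = not equal."""
    static_result = fingerprint(stored, algorithm)
    live_result = fingerprint(current, algorithm)
    return 0 if static_result == live_result else 1


for name in ("sha256", "sha512"):
    print(name,
          change_indicator(b"data", b"data", name),
          change_indicator(b"data", b"datA", name))
# sha256 0 1
# sha512 0 1
```

static and live results are only comparable when both were produced by the same algorithm.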
excel related dependencies
correct execution of the computation function
integrity of the spreadsheet environment
nature of manual review control
this control requires manual review
manual processes are by nature subject to risk
unintentional manipulation
the control is designed to detect accidental and unintended changes
the control may surface intentional changes incidentally
7_reproducibility:
class: reproducibility
value: not applicable, this is a control overview
​
8_notes:
class: note
value: none
END WORKPAPER​​
​
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.02___control_container___D
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: documents microsoft excel as the execution, storage, and review container for control
2_source:
class: provenance
value: microsoft office (commercial software product)
​
3_dependencies:
class: dependency
value: not applicable, workpaper is to document container not control itself
4_exclusions:
class: exclusion
value: not applicable, workpaper is to document container not control itself
5_subject_matter:
class: substance
value: expanded section
configuration considerations
excel behavior may vary based on:
excel version
regional settings
calculation mode (automatic vs manual)
workbook protection settings
these factors may affect reproducibility and must be controlled or acknowledged by user
container role
microsoft excel serves as the execution, storage, and review container for the di01 control
excel provides:
a structured grid for data selection
formula-based computation and comparison
persistent storage of static computation results
visual inspection of change-indicator outputs
excel does not define the control logic or computation mechanism
execution context
the DI01 control executes within excel via worksheet formulas
execution characteristics:
static computation results are stored as fixed values in cells
live computation results are continuously derived from current cell values
equality comparison is executed by formula
change-indicator output is visible immediately upon data modification
no batch job or scheduled execution is required
storage behavior
excel stores:
source data values
static computation results
live computation formulas
change-indicator formulas
excel does not enforce immutability of stored values
persistence relies on standard workbook save behavior
user interaction and review
the user:
defines data selection boundaries
initiates static computation generation
observes change-indicator outputs
performs manual review when change is detected
excel provides visibility but does not perform interpretation or resolution
​6_constraints:
class: constraint
value: not applicable, workpaper is to document container not control itself
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is to document container not control itself
​
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 02_workpapers___control_overview
workpaper id: 02.03___control_example_excel___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: provide an illustrative example of control 02
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, workpaper is illustrative only
​
4_exclusions:
class: exclusion
value: not applicable, workpaper is illustrative only
5_subject_matter:
class: substance
value: expanded
example control
illustrates the control at a conceptual level only
example demonstrates how control 02 detects change, not how sha256 is computed
formulas in "row 1" are descriptive not live excel formulas
control output rule:
• Output 0 when static and live sha256 values are identical
• Output 1 when static and live sha256 values differ
control illustrative example
illustrative data = hdkslHSKD
static sha-256 = stored sha-256 computation result
live sha-256 = currently computed sha-256 computation result
computation formula=sha256_text(hdksIHSKD)
static sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
live sha-256=e6386e29cc15b45e434e6c2316cecee7a79ac82ef068d1d390de52cf9472d960
control formula = IF(static=live,0,1)
[static] = [live] → 0
[static] ≠ [live] → 1
control result = 0 (the two sha-256 computations agree)
notes:
stored result execution method varies (e.g., vba in excel, python)
there are multiple excel formulas for calculating the sha256 computation
another formula could be used to identify differences between static and live sha256 computations
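illustrative sketch only (python): a minimal version of the example above, assuming utf-8 encoding and using hashlib as a stand-in for the sha256_text function named in the example. the second string is a hypothetical variant that swaps lowercase l for uppercase I; the two can look alike on screen but are different symbols. no digest values are asserted here.

```python
import hashlib


def sha256_text(text: str) -> str:
    """Stand-in for the example's sha256_text: UTF-8 encode, SHA-256, hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


data = "hdkslHSKD"         # illustrative data (fifth character is lowercase l)
look_alike = "hdksIHSKD"   # hypothetical variant (fifth character is uppercase I)

static_hash = sha256_text(data)   # generated once and stored

# control output rule: 0 when static and live agree, 1 when they differ
result_same = 0 if static_hash == sha256_text(data) else 1
result_changed = 0 if static_hash == sha256_text(look_alike) else 1

print(result_same, result_changed)  # prints: 0 1
```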
​6_constraints:
class: constraint
value: not applicable, workpaper is illustrative only
7_reproducibility:
class: reproducibility
value: not applicable, workpaper is illustrative only
​
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id: 03.01___sha256_concepts___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: describe concepts related to sha-256 computation
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope
​
4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
5_subject_matter:
class: substance
value: expanded section
concepts grouped functionally, not alphabetically
5.1 concept notes: not concept specific
descriptions are intentionally brief
descriptions grouped by function not alphabetical
descriptions are functional, not dictionary-based
concepts included were subjectively chosen by builder with GPT assistance
5.2 primitives: smallest units a system operates on directly
primitive
the smallest unit a system treats as indivisible for its intended purpose within that system’s abstraction boundary
may have meaning, but is not decomposed further within the system
combined to build more complex structures, but are not broken down further within the system
5.3 display primitives: what is converted to computer primitives
symbol
a character (e.g., A, 9, space, :, emoji)
some are human readable, and some are not (whitespace, non-printing characters, etc.)
symbols (sha-256)
not processed directly by sha-256
are interpreted only after encoding converts them into bytes
data scope
defined by explicit selection and boundary prior to computation by the end user
can be a single character, a word, a row, a table, a file, a folder, etc.
can include invisible characters (spaces, line breaks, formatting)
5.4 computer primitives: what computers see/interpret
bit
single binary value (0 or 1)
the smallest unit computers use to represent information
byte
a group of 8 bits
how computers group bits into meaningful units
one byte has 256 possible patterns (2^8 arrangements of 0s and 1s)
byte pattern
the ordered sequence of bits (0s and 1s) within a byte
computers represent information by the arrangement of bits within bytes
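illustrative sketch only (python): one byte shown as its 8-bit pattern, and a count of all possible patterns. the sample symbol is an arbitrary choice.

```python
byte_value = ord("A")                     # code point 65 fits in a single byte
bit_pattern = format(byte_value, "08b")   # the byte written as 8 bits

print(bit_pattern)        # 01000001
print(len(bit_pattern))   # 8
print(2 ** 8)             # 256

# every possible byte pattern, from 00000000 to 11111111
all_patterns = [format(n, "08b") for n in range(256)]
print(all_patterns[0], all_patterns[-1], len(set(all_patterns)))
# 00000000 11111111 256
```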
5.5 encoding: process of translating human readable symbols to computer readable bytes
encoding
can be likened to translating - from symbol to bytes (groups of bits - 0s and 1s)
encoding inputs: symbols identified by character set (Unicode code points or equivalent) prior to encoding
the mapping of symbols to byte patterns
encoding systems
stable schemas or frameworks that map symbol(s) to bytes (patterns of bits)
the same symbol may be represented with different byte patterns by different schemas
byte patterns must be stable within an encoding framework for results to be deterministic and comparable
code unit
the fixed-size unit of storage used by an encoding scheme to represent code points
not a character or a symbol
it is an intermediate representation between symbols and raw bytes
code units are defined by the encoding scheme, not by sha-256
UTF-8 uses 8-bit code units; UTF-16 uses 16-bit code units
one code point may map to one or more code units depending on the encoding
UTF-8 (encoding)
a type of byte mapping system where symbols (Unicode code points) are mapped to one or more 8-bit code units (8 bits = 1 byte)
symbol-to-byte mapping is stable across all UTF-8 systems
a given code point always maps to the same byte sequence under UTF-8
the mapping is deterministic (identical inputs = identical byte patterns)
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
UTF-16 (encoding)
a type of byte mapping system where symbols (Unicode code points) are mapped to one or two 16-bit code units (16 bits = 2 bytes), with surrogate pairs used for code points above U+FFFF
symbol-to-code-unit mapping is stable across all UTF-16 systems; the byte order of each code unit (UTF-16LE vs UTF-16BE) must also be fixed for byte patterns to match
a given code point always maps to the same code unit sequence under UTF-16
the mapping is deterministic (identical inputs and byte order = identical byte patterns)
the mapping is reversible (byte pattern to symbol and symbol to byte pattern)
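illustrative sketch only (python): the same three symbols under two encodings. the sample symbols are arbitrary choices, written as escapes so the example does not depend on how this text itself is stored; utf-16-le is chosen here only to make the byte order explicit.

```python
# U+0041 (A), U+00E9 (e with acute accent), U+1F600 (an emoji)
samples = ["A", "\u00e9", "\U0001F600"]

for symbol in samples:
    utf8_bytes = symbol.encode("utf-8")
    utf16_bytes = symbol.encode("utf-16-le")  # little-endian, no byte order mark
    print(f"U+{ord(symbol):04X}  utf-8: {utf8_bytes.hex(' ')}  utf-16-le: {utf16_bytes.hex(' ')}")
    # both mappings are reversible
    assert utf8_bytes.decode("utf-8") == symbol
    assert utf16_bytes.decode("utf-16-le") == symbol
# U+0041  utf-8: 41  utf-16-le: 41 00
# U+00E9  utf-8: c3 a9  utf-16-le: e9 00
# U+1F600  utf-8: f0 9f 98 80  utf-16-le: 3d d8 00 de

# the same code unit written in the other byte order gives different bytes
print("A".encode("utf-16-be").hex(" "))  # 00 41
```

the emoji needs four 8-bit code units in UTF-8 and two 16-bit code units (a surrogate pair) in UTF-16.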
5.6 sha-256 algorithm input concepts
input
symbols must be converted to bytes before they are delivered to the sha-256 algorithm
sha-256 does not know what encoding system is used; it recognizes byte patterns only
sha-256 cannot see symbols, words, or meaning, only byte sequences
boundaries for byte input(s) are defined by the user (e.g., a word, a row, columns a + b, an entire table)
input and encoding
different encodings ⇒ different byte arrays ⇒ different sha-256 output
Same encoding + same symbols ⇒ same bytes ⇒ same sha-256 output
assumes same byte order and concatenation order
possibility of collisions noted and deemed out of scope
byte stream construction
the order in which bytes are concatenated is significant
the same bytes in a different order produce a different result
e.g., concatenating columns A+B vs B+A produces different byte streams and different hashes
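illustrative sketch only (python): the three input rules above, using hypothetical cell values. plain string concatenation with no separator is an assumption made to keep the example short.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """SHA-256 of a byte sequence, returned as a hex string."""
    return hashlib.sha256(data).hexdigest()


col_a, col_b = "alpha", "beta"   # hypothetical cell values

# same symbols, different encodings -> different bytes -> different output
utf8_hash = sha256_hex(col_a.encode("utf-8"))
utf16_hash = sha256_hex(col_a.encode("utf-16-le"))

# same symbols and encoding, different concatenation order -> different output
a_then_b = sha256_hex((col_a + col_b).encode("utf-8"))
b_then_a = sha256_hex((col_b + col_a).encode("utf-8"))

# same encoding, same symbols, same order -> same output
repeat = sha256_hex((col_a + col_b).encode("utf-8"))

print(utf8_hash == utf16_hash)  # False
print(a_then_b == b_then_a)     # False
print(a_then_b == repeat)       # True
```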
5.7 sha-256 algorithm and computation concepts
algorithm
a fixed set of rules for transforming input into output
the rules do not change based on content
only the input affects the result
deterministic
same inputs (same bytes) produce the same result
change in input bytes produces a different result except for cryptographic collisions
different encoding systems produce different results because they produce different byte patterns for the same symbol(s)
sha-256 algorithm
A fixed, deterministic algorithm that converts input bytes into a 32-byte output
SHA = Secure Hash Algorithm
256 bits fixed output (32 bytes × 8 bits per byte)
if algorithm changes, it is no longer sha-256
sha-256 computation
execution of the sha-256 algorithm
sha-256 determinism
assumes a fixed encoding and consistent byte stream construction
sha-256 stability
the sha-256 specification text has received clarifications, but the algorithm definition has not changed
any change would create a different algorithm, not sha-256
this stability is critical for compatibility, verification, and security
sha-256 reversibility
raw output is non-reversible: arbitrary-length inputs (effectively infinite possibilities) map many-to-one onto a finite, fixed-length output space, and no practical method is known for recovering an input from its output
the non-reversibility of the output makes the algorithm one-way
5.8 sha-256 computation other factors (salting)
salting (explicitly excluded)
salt = extra bytes deliberately added to the input before sha-256 computation
no additional bytes are appended to input prior to sha-256 computation
extra data (salt) would cause identical inputs to produce different outputs when salt differs
salt is used to:
prevent precomputed (rainbow table) attacks
ensure identical passwords do not hash to the same value across different records
salt is good for passwords
salt is bad for deterministic comparison controls and will not be used
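illustrative sketch only (python): why salt is excluded from this control. the input bytes and the two fixed salt values are hypothetical, and prepending the salt to the input is an assumption; this is not a password-storage example.

```python
import hashlib

data = b"same input"      # hypothetical input bytes
salt_one = b"\x00" * 16   # hypothetical salt values
salt_two = b"\x01" * 16

# unsalted: identical input gives identical output every time
plain_first = hashlib.sha256(data).hexdigest()
plain_second = hashlib.sha256(data).hexdigest()

# salted: identical input gives different output when the salt differs
salted_first = hashlib.sha256(salt_one + data).hexdigest()
salted_second = hashlib.sha256(salt_two + data).hexdigest()

print(plain_first == plain_second)    # True
print(salted_first == salted_second)  # False
```

a static result and a live result computed with different salts would never match, so the comparison would report change even when the data is unchanged.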
5.9 sha-256 computation other factors (keying)
keying (explicitly excluded)
key = secret value combined with input during sha-256 computation
key is applied deliberately before or during computation
keyed sha-256 computation produces different outputs for identical inputs when keys differ
how keying is used:
authentication and message integrity (e.g., HMAC)
ensures only parties with the key can reproduce or verify the hash
used to prove authenticity, not just detect change
why keying exists:
prevents unauthorized hash forgery
binds the hash result to possession of a secret
protects against tampering in adversarial environments
why keying is excluded here:
deterministic comparison requires identical inputs → identical outputs
secret keys would prevent independent recomputation by anyone who does not hold the key
control is not performing authentication or trust enforcement
key management is out of scope
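illustrative sketch only (python): keyed versus unkeyed computation, using HMAC from the standard library as the keyed example named above. the message and both keys are hypothetical values.

```python
import hashlib
import hmac

message = b"same input"        # hypothetical input bytes
key_a = b"hypothetical-key-a"  # hypothetical secret keys
key_b = b"hypothetical-key-b"

# unkeyed: anyone with the same bytes can reproduce the result
unkeyed_first = hashlib.sha256(message).hexdigest()
unkeyed_second = hashlib.sha256(message).hexdigest()

# keyed: the result depends on the key as well as the input
tag_a = hmac.new(key_a, message, digestmod=hashlib.sha256).hexdigest()
tag_b = hmac.new(key_b, message, digestmod=hashlib.sha256).hexdigest()

print(unkeyed_first == unkeyed_second)  # True
print(tag_a == tag_b)                   # False
print(tag_a == unkeyed_first)           # False
```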
5.10 control position - keying and/or salting
no salt added
no secret keys used
sha-256 executed without key material
output comparability preserved across environments
5.11 sha-256 algorithm output concepts (machine readable)
output general
sha-256 takes bytes as input and produces bytes as output
different encoding systems produce different byte patterns
different byte input produces different byte output (except for cryptographic collisions)
32 byte output
the raw result produced by the sha-256 algorithm
this output is always exactly 32 bytes (fixed output)
32 bytes x 8 bits per byte = 256 bits as in sha-256
raw output would be a grouping of 256 0s and 1s
below is a made-up example of one 32 byte output: 32 sets of 8 bits (0s and 1s)
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101
01101010 00001001 11101010 11110110 00010101 11110010 10100101 11000111
00011010 10110101 11101101 00001111 10001001 00101010 10101011 00100101
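illustrative sketch only (python): an actual raw output, for comparison with the made-up pattern above. the input b"abc" is an arbitrary choice (it is a commonly used sha-256 test input).

```python
import hashlib

digest = hashlib.sha256(b"abc").digest()   # raw output as bytes

print(len(digest))       # 32
print(len(digest) * 8)   # 256

# each byte written as 8 bits, separated by spaces
bits = " ".join(format(b, "08b") for b in digest)
print(bits[:35])  # first four bytes: 10111010 01111000 00010110 10111111
```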
5.12 representation change from 32 byte output to human readable (not done by sha-256)
problem with 32 byte output
raw output difficult for human to read and store
hexadecimal
number system with a base of 16
decimal -- base 10 -- digits 0-9
hex -- base 16 -- digits 0-9 plus 6 letters a-f (a-f = 10-15)
64 character hexadecimal string
a standard representation of the sha-256 output (after computation)
the 32-byte output is converted into a 64-character hexadecimal string
the conversion methodology is stable - same 32 byte patterns produce same 64 character hexadecimal strings
more human readable and easier to store
each byte is represented by a combination of 2 characters
32 bytes x 2 characters per representation = 64 characters
the characters are referred to as hexadecimal because they are 0-9 and a-f specifically
the hexadecimal representation is reversible - can be converted back into bytes
fingerprint (raw)
fixed 32 byte sha-256 output
fingerprint (display)
the 32 byte output is converted and displayed as a 64 character hexadecimal string
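illustrative sketch only (python): converting the raw 32 byte output to the 64 character display form and back. the input b"abc" is an arbitrary choice.

```python
import hashlib

raw = hashlib.sha256(b"abc").digest()   # fingerprint (raw): 32 bytes
display = raw.hex()                     # fingerprint (display): 64 hex characters

print(len(raw), len(display))   # 32 64
print(display)
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad

# each byte becomes two hex characters
print(raw[0], format(raw[0], "02x"))    # 186 ba

# the letters a-f stand for the values 10-15
print([int(c, 16) for c in "abcdef"])   # [10, 11, 12, 13, 14, 15]

# the display form converts back to the same bytes
print(bytes.fromhex(display) == raw)    # True
```

the conversion changes only the representation; it is the hashing step, not the hex step, that cannot be reversed.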
​6_constraints:
class: constraint
value: expanded section
algorithm longevity
SHA-256 is widely supported and not considered broken
future algorithms may supersede SHA-256 for some use cases
this documentation reflects the current state of SHA-256 usage
cryptographic collision risk
collision = two different byte inputs → same output
collisions are mathematically possible but practically infeasible to find for sha-256
accepted as outside the scope
7_reproducibility:
class: reproducibility
value: concepts based on builder (subjective), not intended for reproducibility
​
8_notes:
class: note
value: none
END WORKPAPER
BEGIN WORKPAPER
control orientation: 02_control___data_integrity_comparative_computation_to_detect_change
section orientation: 03_workpapers___sha256_computation_independent_of_execution_method
workpaper id: 03.02___sha256_considerations___d
first_published: YYYY-MM-DD
last_updated: YYYY-MM-DD
​
1_overview:
class: context
value: describe considerations related to sha-256 computation
2_source:
class: provenance
value: builder-authored
​
3_dependencies:
class: dependency
value: not applicable, concepts only - execution of control out of scope
​
4_exclusions:
class: exclusion
value: the internal mathematical operations that perform the sha-256 computation are out of scope for this documentation
5_subject_matter:
class: substance
value: expanded
concepts grouped functionally, not alphabetically
​6_constraints:
class: constraint
value:
7_reproducibility:
class: reproducibility
value:
​
8_notes:
class: note
value:
END WORKPAPER