Interpretation for scales of measurement linking with abstract algebra

The Stevens classification of levels of measurement involves four types of scale: “Nominal”, “Ordinal”, “Interval” and “Ratio”. This classification has been used widely in medical fields and has played an important role in the composition and interpretation of scales. With this classification, levels of measurement appear organized and validated. However, a group theory-like systematization offers an alternative because of its logical consistency and its applicability without exception in the natural sciences, and it may offer great advantages in clinical medicine. From this viewpoint, the Stevens classification is reformulated within an abstract algebra-like scheme: an ‘Abelian modulo additive group’ for the “Ordinal scale” accompanied by ‘zero’, an ‘Abelian additive group’ for the “Interval scale”, and a ‘field’ for the “Ratio scale”. Furthermore, a vector-like display arranges a mixture of schemes describing the assessment of patient states. With this vector-like notation, data mining and data-set combination are possible on a higher abstract structure level based upon a hierarchical-cluster form. Using simple examples, we show that operations acting on the corresponding mixed schemes of this display allow for a sophisticated means of classifying, updating, monitoring, and prognosis, where better data mining/data usage and efficacy are expected.


Background
In 1946, S. S. Stevens devised his classification of "levels of measurement" [1], which has subsequently been used widely and has played an important role in the composition and interpretation of scales in medical fields. The systematics of levels of measurement seems to have been organized and validated by virtue of this classification. Nevertheless, we believe that an abstract algebra-like interpretation/systematization awaits introduction because of its logical consistency and its applicability without exception in describing patterns and processes. We conjecture that it offers benefits in clinical medicine, especially with respect to scales of measurement [2,3].
Thus, in the following, we re-interpret the Stevens classification and endeavour to give it meaning within an abstract algebra-like model. There, the preferred construct is a vector-like structure of various sets of scores based on individual scales, together with operators that permit changes of score within each set. Additionally, classical datasets that are classified in terms of the Stevens scales of measurement can be mined and combined on a higher abstract structure level based upon a hierarchical-cluster form. To explore this possibility, we provide simple examples to help readers understand this modelling tool.

§1. Application of group/field of abstract algebra to the various types of scales
Stevens classified the scales of measurement into four scale types [1]: I) "Nominal scale" that uses only labels or numbers (e.g., numbering of football players, blood type, nationality); II) "Ordinal scale" that introduces equality and rank-ordering (e.g., hardness of minerals, grading for efficacy of clinical treatment); III) "Interval scale" that is based on equal quantitative intervals (e.g., temperature as read in centigrade, duration, frequency); and IV) "Ratio scale" that assumes a 'zero' as an origin, equality, rank order, equality of intervals, and equality of ratios (e.g., absolute temperature, speed of vehicles, and most physical values), and that therefore admits manipulations using the four arithmetic operations.
For I), the "Nominal scale", there seems to be little room where group theoretical operations apply because within that scale only a labelling scheme is permissible. Although some non-cyclic group might be definable, it seems that little meaning can be attached to operations for this sort of scale.
For II), the "Ordinal scale", a ranking is realised by introducing a set with an N-graded scoring like '1, 2, 3, …, N−1, N' (N: positive integer) for a score deficient in (or with no absolute need for) a quantitative character, but not requiring a '0' score according to the Stevens classification. Historically, the "graphic rating scale", a grading from I to V, was proposed by Hayes and Patterson in 1921 [4], and Freyd in 1923 [5]. However, here, we envisage either decreasing each score by '1' in an N-graded graphic scale, which necessitates a '0', so that {0, 1, 2, 3, …, N−2, N−1} establishes the scoring scale, or simply adding the score '0' as in {0, 1, 2, 3, …, N−2, N−1, N}. We focus on the former type. Then, for an arbitrary non-negative integer X, the operation giving the remainder of X after division by N, written X (mod N), defines the cyclic group Z_N = {0, 1, 2, 3, …, N−2, N−1}, where modulo N addition is postulated. With this assumption, given two elements 'X_j' and 'X_k' (X_j, X_k ∈ Z_N) corresponding, for example, to the severity of a clinical symptom and/or finding, composition (denoted by '*') is taken to be modulo N addition: 'X_j * X_(j→k) = X_k' (with X_(j→k) ∈ Z_N). Here 'X_(j→k)' is an operator that produces the change in score 'X_j → X_k' (formally, 'X_(j→k) = X_j^(−1) * X_k = X_k − X_j'). Then, all scores 'X_j' and operators 'X_(j→k)' are composable within a single Abelian modulo additive group 'Z_N', where 'X_j * X_k = X_k * X_j' holds, at least in terms of the operation '*'. Thus, a patient's state corresponding to a certain illness or disease can be changed through the application of a single operation determined by the two elements of 'Z_N' [6,7] that represent the previous and current state of the patient. A simple example is presented in Appendix A.
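The modulo-N composition described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `compose` and `operator_between`, the choice N = 7, and the two example scores are assumptions introduced here.

```python
N = 7  # assumed modulus for illustration; scores lie in Z_N = {0, 1, ..., N-1}


def compose(x_j: int, x_op: int, n: int = N) -> int:
    """Composition '*': modulo-n addition of a score and an operator."""
    return (x_j + x_op) % n


def operator_between(x_j: int, x_k: int, n: int = N) -> int:
    """Operator X_(j->k) = X_k - X_j (mod n) that changes score x_j into x_k."""
    return (x_k - x_j) % n


# Hypothetical severity scores at the previous and the current session.
previous, current = 5, 2
change = operator_between(previous, current)

print(change)                     # the operator, itself an element of Z_7
print(compose(previous, change))  # applying it to the previous score
```

Because the operator is again an element of Z_N, scores and changes of score can be stored and composed with the same function, which is the point made in the paragraph above.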
In contrast to a state of maximum severity, the ideal healthy state for any given disease Y is E_Y = [0|0|0|0|0|…], in which all scores are '0'; it is represented by the identity element of the group Y = {Z_N^{×n}, *}. Here, Y is the n-fold Cartesian product of 'Z_N' (n: the number of components) that comprises all possible assessments related to each state of a given disease, for instance, 'hypertension', 'hyperglycaemia', 'diabetes mellitus', 'acute pancreatitis', 'systemic lupus erythematosus', and 'cerebral artery stroke'. If, in addition, composition is given by modulo 'N' arithmetic, prime numbers (e.g., N = 7) are preferable [8,9], and a considerable part of the components could overlap among individual diseases, as mentioned in our previous reports [6,7]. Note that, in practice, equal increments within a grading scheme are not always postulated. Nevertheless, the scale represented by this Abelian modulo additive group 'Z_N^{×n}' will be called a "modular scale". It may, however, be an atypical case (a partially weakened example) of a "Ratio scale" (type IV) without the strict requirement for equal calibration. Indeed, such scales exist: in the 'TNM classification (with a 'T0' entry) for malignant tumours' [7,10], for example, grades for scoring are determined according to histological characteristics, selection of treatment, and prognosis, with no strict linearity in scale, yet the classification might be regarded as a "modular scale".
Based upon these results, the following, for instance, are considered composable: the Abelian modulo additive groups Y_1 = {Z_7, *} for 'hypertension', Y_2 = {Z_7, *} for 'hyperglycaemia', Y_3 = {Z_7, *} for 'diabetes mellitus', Y_4 for 'acute pancreatitis', Y_5 for 'systemic lupus erythematosus', Y_6 for 'cerebral artery stroke', Y_all = {Z_7 × Z_7 × Z_7 × …, *} = {Z_7^{×n}, *} (n: the number of components) for an entire body, and Y_7 = {Z_8 × Z_4 × Z_2 × Z_2, *} for the 'TNM classification (with a 'T0' entry) for malignant tumours' [7,10]. Additionally, these are treatable without exception within the abstract algebraic theory. For this case, an equal calibration for severity may not be beneficial if used in clinical treatments. However, for 'delirium', 'chronic liver dysfunction', 'acute pancreatitis', and 'diabetes mellitus', for example, total scores based on equal calibration are desirable to assess disease severity.
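A product group such as Y_7 = {Z_8 × Z_4 × Z_2 × Z_2, *} can be sketched as componentwise modular addition. The following is only an illustration under stated assumptions: the moduli are those quoted in the text, but the two example states, the clinical meaning of each component, and the helper names (`compose`, `inverse`, `operator_between`) are hypothetical.

```python
from typing import Tuple

State = Tuple[int, ...]

# Moduli of the product group Z_8 x Z_4 x Z_2 x Z_2 quoted in the text for Y_7.
MODULI: State = (8, 4, 2, 2)
IDENTITY: State = (0, 0, 0, 0)  # every score '0': the identity element


def compose(a: State, b: State, moduli: State = MODULI) -> State:
    """Composition '*': componentwise addition, each component modulo its own N."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def inverse(a: State, moduli: State = MODULI) -> State:
    """Inverse element: the vector that composes with a to give the identity."""
    return tuple((-x) % m for x, m in zip(a, moduli))


def operator_between(a: State, b: State, moduli: State = MODULI) -> State:
    """Operator a^(-1) * b that changes state a into state b."""
    return compose(inverse(a, moduli), b, moduli)


# Hypothetical states at two sessions (values chosen only for illustration).
before: State = (3, 1, 0, 1)
after: State = (5, 2, 1, 0)
change = operator_between(before, after)

print(change)
print(compose(before, change) == after)
print(compose(before, inverse(before)) == IDENTITY)
```

Keeping one modulus per component lets scales with different numbers of grades sit side by side in a single vector without changing the composition rule.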
For III), the "Interval scale", differences in quantities are allowed. An example is 'periods of time' or 'duration', which, although it can be measured with ratio scales, enables one period to be double another when compared. The same is true of 'temperature'. If parameters 'X_j' and 'X_l' ∈ R (the continuous real number line) have ranges, we can consider an operator 'X_k' that causes changes from 'X_j' to 'X_l', and introduce a binary operation, denoted '•', where ordinary addition and its inverse, subtraction, are assumed; that is, 'X_j • X_k = X_l' with 'X_k = X_l − X_j'. In this regard, 'X_j' can also be expressed as the sum of an integer part and a decimal part, 'X_j = m_j + c_j' (m_j = [X_j], c_j = X_j − [X_j], 0 ≤ c_j < 1; '[X]' is the floor function, meaning the greatest integer not exceeding 'X'). Similarly, '1' is a 'unit length' of the respective values. Thus, (iii)−(v) can be redefined using the unit length '1' as an interval scale. There exists an identity element, '0'. Naturally, commutativity and associativity are satisfied. Let U be the set that comprises all 'X_j's, i.e., U ≡ {X_j | X_j ∈ R}. Because 'X_j, X_k, X_l' ∈ U, the closure law holds. Therefore, this operation defines a group U = {X_j, •} [2,3]. "Body temperature readings", "clock time for the onset of sleep within a day" and "clock time for the onset of drip infusion within a day" are definable in this scale. Examples of the first two are provided in Appendix B. By making use of this procedure, the distinction between quantitative values and operators is eliminated, and both can be regarded as elements belonging to a single group U.
Moreover, a collection of additive Abelian groups based on an individual's clinical values can be described, for example, as U_1 ≡ {X_1j | X_1j ∈ R (deg C)} and U_1 = {X_1j, •} for "body temperature readings", U_2 ≡ {X_2j | X_2j ∈ R (/24 hrs)} and U_2 = {X_2j, •} for "clock time for the onset of sleep within a day", U_3 ≡ {X_3j | X_3j ∈ R (/24 hrs)} and U_3 = {X_3j, •} for "clock time for the onset of drip infusion within a day", …, U_N = {X_Nj, •}, … (N: natural number). These are considered readily treatable and recordable within an abstract algebraic context.
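The interval-scale group can be illustrated with a short sketch in which a reading and the operator that changes it are both plain numbers under ordinary addition. The body temperature values below are hypothetical, the function names are assumptions introduced here, and `Decimal` is used only so that the printed differences are exact.

```python
import math
from decimal import Decimal
from typing import Tuple


def compose(x_j: Decimal, x_k: Decimal) -> Decimal:
    """Binary operation of the group U: ordinary addition."""
    return x_j + x_k


def operator_between(x_j: Decimal, x_l: Decimal) -> Decimal:
    """Operator X_k = X_l - X_j that changes reading x_j into x_l."""
    return x_l - x_j


def split(x: Decimal) -> Tuple[int, Decimal]:
    """Integer part m = floor(x) and decimal part c = x - m, with 0 <= c < 1."""
    m = math.floor(x)
    return m, x - m


# Hypothetical body temperature readings (deg C) at two sessions.
earlier, later = Decimal("36.5"), Decimal("38.2")
change = operator_between(earlier, later)

print(change)                    # the operator, again an element of U
print(compose(earlier, change))  # recovers the later reading
print(split(later))              # (integer part, decimal part)
print(split(Decimal("-1.7")))    # the decimal part stays in [0, 1) for negative operators
```

The identity element is the zero change, and a negative operator such as −1.7 is still an element of the same group, which is why values and operators need not be kept in separate sets.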
For IV), the "Ratio scale", the 'administration of medicine (with strict dosage regimes)' and the 'International Statistical Classification of Diseases and Related Health Problems' [11] were given as examples in our previous reports [6,7]. Essentially, for this scale, because the four arithmetic operations are possible, 'rings' and 'fields' in abstract algebra are applicable so long as composition is given by modulo 'N' arithmetic with 'N' a prime. Although there could be scope for applying the four modulo arithmetic operations (denoted collectively by '†' in 'X_j † X_k = X_l') to assessment scoring in clinical medicine, it might be preferable at this stage to confine the application of ratio scales to just modulo N addition '*' in place of '†', in a manner similar to that established in Appendix A. For the example given in Appendix A, the difference in interpretation is the presence/absence of an equal calibration.
Just as the scale of the 'TNM classification for malignant tumours' [10] was regarded as an example of an "Ordinal scale", some of the scales defined as "Ratio scales" at first glance should be regarded as "Ordinal scales" accompanied by '0'. It might be contentious whether clinical assessments performed using superficial scales based on the four arithmetic operations could have sufficient validity in clinical treatments or clinical research.
Nevertheless, other clinical scales range over a semi-open continuous interval like '0 ≤ X < +∞' (X: real number), such as 'blood concentration of white blood cells: […] − ]) (reference range for blood tests: 12 ± 2 mEq/l)' and 'Base excess [BE] (reference range for blood tests: 0 ± 2 mmol/l)'. However, both can be treated using the notion of a 'field' because those values are real numbers for which all four arithmetic operations are defined, with the exception of division by zero. Thus, the above clinical values can be defined over a 'field'. In this regard, we assume the rule that each unit, like 'mEq/l', automatically accompanies the value in the results of operations, regardless of which of the four arithmetic operations is used (note that there are cases where units vanish, as when ratios are taken, 'mEq/mEq (unitless)', or are displayed in reciprocal form like 'l/mEq'). Examples for '[WBC] (/mm^3)' and '[Na+] (mEq/l)' are presented in Appendix C.
In this case, we consider a set V and assume that '#' denotes any one of 'addition, subtraction, multiplication, and division' collectively; thus, 'X_j # X_k = X_l' (∈ V), where ordinary arithmetic calculations are performed, excluding of course division by zero.
For set V, addition is commutative, X_j + X_k = X_k + X_j, and associative, (X_j + X_k) + X_l = X_j + (X_k + X_l). As for multiplication, set V meets the conditions of a 'monoid' [2,3], with associativity, (X_j × X_k) × X_l = X_j × (X_k × X_l), and with left and right distributivity over addition, X_j × (X_k + X_l) = X_j × X_k + X_j × X_l and (X_j + X_k) × X_l = X_j × X_l + X_k × X_l. A multiplicative inverse is definable except for division by zero. Therefore, we can confirm that set V is a 'field'. It can be expressed as V = {X_j, #} or V = {X_j | X_j ∈ R}.
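The properties listed above can be spot-checked numerically. The sketch below is an assumption-laden illustration, not a proof: it uses Python's exact `Fraction` type (rational numbers, a subfield of R) in place of real-valued clinical measurements, the three sample values are arbitrary, and the helper name `apply_op` is hypothetical.

```python
import operator
from fractions import Fraction

# The collective operation '#': any one of the four arithmetic operations.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def apply_op(symbol: str, x: Fraction, y: Fraction) -> Fraction:
    """Evaluate x # y, refusing the one excluded case, division by zero."""
    if symbol == "/" and y == 0:
        raise ZeroDivisionError("division by zero is excluded from '#'")
    return OPERATIONS[symbol](x, y)


# Arbitrary sample elements of V (no clinical meaning is implied).
x, y, z = Fraction(7, 2), Fraction(-3), Fraction(5, 4)

checks = {
    "addition commutative": x + y == y + x,
    "addition associative": (x + y) + z == x + (y + z),
    "multiplication associative": (x * y) * z == x * (y * z),
    "left distributive": x * (y + z) == x * y + x * z,
    "right distributive": (x + y) * z == x * z + y * z,
    "multiplicative inverse": apply_op("*", x, apply_op("/", Fraction(1), x)) == 1,
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```

Exact rationals are used because floating-point addition is not strictly associative, so a float-based check could fail for reasons unrelated to the algebra.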
Furthermore, different fields based on different sets of clinical values can be described as follows: field V_1 ≡ {X_1j | X_1j ∈ R (/mm^3)} and V_1 = {X_1j, #} for "blood concentration of white blood cells: [WBC] (/mm^3)", field V_2 ≡ {X_2j | X_2j ∈ R (mEq/l)} and V_2 = {X_2j, #} for "administration of a certain drug like lithium carbonate: [Li+] (mEq/l)", field V_3 ≡ {X_3j | X_3j ∈ R (mEq/l)} and V_3 = {X_3j, #} for "sodium: [Na+] (mEq/l)", field V_4 ≡ {X_4j, #} for calcium: [Ca++] (mg/dl), field V_5 for chloride: [Cl−] (mEq/l), field V_6 for 'Anion gap [AG] (mEq/l)', field V_7 for 'Base excess [BE] (mmol/l)', …, V_N, … (N: natural number). For each, an independent abstract algebraic treatment is possible, as in ordinary abstract algebra.

§2. A vector-like notation using group/field operations belonging to a single set
By making use of all types of scales of measurement, we propose a vector-like expression of a patient's state (denoted 'R_j', j = 1, 2, 3, …: number of sessions), where the mixed expression and the totality of operations that could be performed on it belong to a single set R. Because of the possible variety of operation rules, the genuine use of this set may be unwieldy at this stage.
As for the possible application to better data mining or data usage from the viewpoint of our reinterpretation, we provide a simple example that may help readers follow the outline of the argument. Consider an example of 17 states "R_1, R_2, …, R_17" (∈ set R), each with five components ('n = 5'), and arrows (only symbols) that indicate the possible changes among the 'R_j's, as displayed in Figure 1. The scheme covers the notation of our model, and also that of existing methods where (possible) results of data, the 'R_j's, are not combined directly with each other in the sense of operations. Then, the arrows can be re-displayed according to our concepts as operators 'R_(j→k)' that can be regarded as elements 'R_j' belonging to a set R, as in Figure 2. In ordinary data sets, the 'R_j's are merely a collection of values and the arrows in Figure 1 are only marks. However, in our interpretation, all 'R_j's and 'R_(j→k)'s are elements of a single set R subject to the axioms of an abstract algebra, as indicated using the composition symbol '◊' in Figure 3. There, the changes 'from R_j to R_k' can be traced at each session. Displayed in this way, Figure 3 represents an "operational tree" that could offer potential for better data mining/data usage through a more generalized/concise treatment that might be permissible (e.g., withdrawing/recording correspond to schemes in Figure 3). Practical improvements in efficacy, however, will need future investigation.
Here, consider the scenario of Figure 1 where, from an initial value 'R_1', there are four outcomes 'R_6', 'R_9', 'R_16', and 'R_17', containing nodes at 'R_2', 'R_4', 'R_10', 'R_12', and 'R_13'. By making use of our previous examples 'R_1 − R_3', the next simplest examples with 'n (component number) = 5' can be confirmed easily: […] Following these results, the next relations, according to the tree in Figure 3, can be obtained, for instance: […] The operator expressions are evaluated in Appendix F. Similarly, the next sequences are definable in principle: […]

Arrows are interpreted as operators 'R_(j→k)' that could be regarded as ordinary elements 'R_j' belonging to a single set R. Each operator that changes 'R_j' to 'R_k' can be traced and its degree for each session identified from initial and final states. The final states (R_6, R_9, R_16 and R_17) can be traced back to any initiating state 'R_1' by performing an appropriate sequence of 'R_(j→k)'s.

Figure 3 "Operational tree": the compositional scheme using symbol '◊'. The tree of Figure 2 is re-illustrated using composition symbol '◊', where the operators are assumed to belong to the single set R. By admitting algebraic correspondences, this compositional scheme could potentially provide better data mining/data usage.
In general, we denote a node divergence 'R_a to R_b (=R_a ◊ R_(a→b) = R_b)' and 'R_a to R_c (=R_a ◊ R_(a→c) = R_c)' as 'R_a [(◊R_(a→b))(◊R_(a→c))]' (a, b, c: non-negative integers); here '( )( )( )…' means simple juxtaposition. All paths belonging to the operational tree of Figure 3 can then be described/recorded, for instance, as the sequence […] To display, for easy recognition, end states like 'R_6, R_9, R_16, and R_17' and the divergence point 'R_4', a notation such as '(=R_6), (=R_9), (=R_16) and (=R_17)' might be considered. Hence, […] Moreover, composition with an operator is possible, as in operating on 'R_(3→4) [(◊R_(4→5) ◊ R_(5→6))(◊R_(4→8) ◊ R_(8→9) ◊ R_(9→10))(◊R_(4→15) ◊ R_(15→16)) …]' from the left-hand side by 'R_3'. The subsequent result can be expressed in accordance with the single scheme presented in Figure 3: […] Note that the above descriptions (ix)-(xi) express one-to-many functionality. However, we think that these formulae are the algebraic equivalent of the single operational tree exemplified by Figure 3. They play an algebraic role in composite record-keeping in applied fields such as medicine. In this formalism, any possible result 'R_j' (∈ set R) is obtained and traceable from any state 'R_k' under operations involving a plurality of elements belonging to a single set R.
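The idea of recording an operational tree as composed operators can be sketched as follows. Everything concrete here is an assumption made for illustration: the states have only two components (one modulo-7 score and one real-valued reading) instead of five, the state values are invented, and the branch structure is a simplified fragment starting at R_3, built from some of the edges that appear in the bracket notation above.

```python
from fractions import Fraction

# One rule per component: an integer modulus for a modular scale,
# or None for ordinary addition on an interval/ratio-type component.
RULES = (7, None)


def compose(state, op):
    """Composition of a state with an operator, component by component."""
    return tuple(
        (s + o) % m if m is not None else s + o
        for s, o, m in zip(state, op, RULES)
    )


def operator_between(a, b):
    """Operator R_(a->b) that changes state a into state b."""
    return tuple(
        (y - x) % m if m is not None else y - x
        for x, y, m in zip(a, b, RULES)
    )


# Hypothetical states, keyed by their index j in R_j.
STATES = {
    3: (2, Fraction(73, 2)), 4: (4, Fraction(37)),
    5: (3, Fraction(75, 2)), 6: (0, Fraction(73, 2)),
    8: (5, Fraction(38)), 9: (6, Fraction(39)),
    15: (1, Fraction(37)), 16: (0, Fraction(36)),
}
# Simplified fragment of a tree: R_3 -> R_4, which then diverges three ways.
CHILDREN = {3: [4], 4: [5, 8, 15], 5: [6], 8: [9], 15: [16]}


def paths_from(node):
    """All root-to-leaf paths below node, as lists of state indices."""
    kids = CHILDREN.get(node, [])
    if not kids:
        return [[node]]
    return [[node] + rest for kid in kids for rest in paths_from(kid)]


def replay(path):
    """Start at the first state and apply the operators along the path."""
    state = STATES[path[0]]
    for a, b in zip(path, path[1:]):
        state = compose(state, operator_between(STATES[a], STATES[b]))
    return state


for path in paths_from(3):
    label = " -> ".join(f"R_{j}" for j in path)
    print(label, replay(path) == STATES[path[-1]])
```

Storing only an initial state and the operators along each branch is enough to reconstruct every later state, which is one way to read the record-keeping role described above.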
Additionally, we can include data mining in a more symbolic/abstract way as follows. For an arbitrary j (j = 1, 2, 3, …, m), a hierarchical-cluster-like expression can be defined [12]. For instance, a hierarchy on R_j is a set of subsets H = {^1R_j, ^2R_j, ^3R_j, …, ^rR_j} such that (1) R_j ∈ H; (2) for all single-element sets ^sR_j in R_j, ^sR_j ∈ H; and (3) '^sR_j ∩ ^tR_j ∈ {ϕ, ^sR_j, ^tR_j}' for all s ≠ t = 1, 2, …, r. That is, condition (3) means that any two clusters '^sR_j and ^tR_j' are either disjoint or one is contained entirely inside the other, and every individual element of R_j is contained in at least one cluster larger than itself. Note that if '^sR_j ∩ ^tR_j = ϕ' for all s ≠ t, then the hierarchy becomes a partitioning. Henceforth, reference to a hierarchy implies that '^sR_j ∩ ^tR_j ≠ ϕ' for at least one pair of (s, t) values. In the previous example (vii), 'R_j' could be expressed in hierarchical-cluster notation where there are eight clusters (and relabeling within clusters) as shown in Figure 4. If R_j comprises '^1R_j^1 and ^2R_j^1', the first level of hierarchy, 'R_j = ^1R_j^1 ∪ ^2R_j^1' holds. At the second level, '^1R_j^1 = ^11R_j^2 ∪ ^12R_j^2' = [X_(j)1 (mod 7) | X_(j)2 (/24 hrs)] and '^2R_j^1 = ^21R_j^2 ∪ ^22R_j^2' = [X_(j)3 (/mm^3) | X_(j)4 (mEq/l) | X_(j)5 (…)] yield the complete set R_j = {X_(j)1, X_(j)2, X_(j)3, X_(j)4, X_(j)5} = [X_(j)1 (mod 7) | X_(j)2 (/24 hrs) | X_(j)3 (/mm^3) | X_(j)4 (mEq/l) | X_(j)5 (…)]. A hierarchy has additional levels as necessary to reach single units at its base [12]. The top level is the entire dataset 'R_j', which is always composable from base units. That is, arbitrary 'R_k' and 'R_l' can be combined into a single dataset as […] (with positive integer indices). In this way, classical datasets that are classified in the Stevens scales of measurement could be mined and combined on a higher abstract structure level.
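The three hierarchy conditions can be checked mechanically. The sketch below is a minimal illustration under assumptions: the component labels X1 to X5, the particular grouping into two first-level clusters, and the function name `is_hierarchy` are introduced here and are not taken from Figure 4.

```python
from itertools import combinations


def is_hierarchy(base, clusters):
    """Check conditions (1)-(3) for a family of clusters over the set base."""
    base_set = frozenset(base)
    family = {frozenset(c) for c in clusters}
    # (1) the whole set is itself a cluster
    if base_set not in family:
        return False
    # (2) every single element forms a cluster
    if any(frozenset([x]) not in family for x in base_set):
        return False
    # (3) any two clusters are disjoint, or one contains the other
    empty = frozenset()
    return all((a & b) in (empty, a, b) for a, b in combinations(family, 2))


# Hypothetical five-component state with two first-level clusters.
components = ["X1", "X2", "X3", "X4", "X5"]
clusters = (
    [components, ["X1", "X2"], ["X3", "X4", "X5"]]
    + [[c] for c in components]
)

print(is_hierarchy(components, clusters))

# Adding a cluster that straddles the two first-level clusters breaks (3).
overlapping = clusters + [["X2", "X3"]]
print(is_hierarchy(components, overlapping))
```

A check of this kind could be run before two datasets are merged, so that a combined record is only built from groupings that nest cleanly.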
To help readers better understand the concept, a sequence of schemes illustrating the principles of our model is presented in Figure 5. Subject to future improvements, we envisage that this compact description is versatile enough to provide better data mining/data usage than existing methods, although a final version is far from complete at this early stage.

§3. Supplementary suggestions and limitations
If the four arithmetic operations are appropriate in handling the values from clinical assessments, representation by "Ratio scales" (in some cases, the previously mentioned "modular scale" with a suitable modulo number) might be effective in describing clinical treatments or studies. The "Numerical rating scale (NRS)" with range '0-10' [13,14] illustrates the point: there, the modulo 11 additive group 'Z_11' arises as a natural modular scale. In contrast, similar approaches might be difficult for a "visual analogue scale" [15,16], where values could take any real number.
Whereas rating scales systematized in an abstract algebra-like form may enable a more generalized/sophisticated understanding, establishing a link between the fields of clinical medicine and abstract algebra, and using mixed states and operators in vector-like notation as in (vii)-(xi), does not always assure more concise manipulations. A mixed treatment as exemplified in (vii)-(xi) might not always yield optimal results at present. In general, combining group and field-like structures within 'R_j' may cause some confusion in handling the 'R_j's, although benefits accrue through operational compliance and convenience in dealing with the abstract algebra. For description and records, a vector-like definition 'R_j' in which only the four types I)-IV) are used may not always be advantageous (particularly for I), the "Nominal scale", where systematization of operations seems to be impossible). Nevertheless, we infer that, for the handling of operations in mixed notation like 'R_j', the classification and synthesis of scales of measurement in some group/field-like form may be devised with a more rigorous methodology in future improvements.
That apart, similar, redundant, and obscure components may have been incorporated into the description of the 'R_j's without discretion. The 'R_j' in such instances loses validity and versatility in terms of a concise composition of scales. This is considered to result from the fact that a total state of a certain disease or a condition of a patient is not always composable or describable via the combination of partial components. This implies that a larger number of components is not always desirable for assessment or rating scales.

Figure 5 Schemes for data mining and combination in some higher abstract structure level. First, the classical dataset, classified by the four types of scales of measurement (Stevens classification), is re-interpreted as a group/field-like operational structure. Second, a vector 'R_j' is defined that is composed as the product of each type of operation for all datasets other than for those classified as "Nominal scale". Third, the 'R_j's are constituted as an "operational tree". Fourth, 'R_j' with any arbitrary j (j = 1, 2, …, m) could be mined and combined as "hierarchical clusters".
Unfortunately, almost all current assessment scales in medicine are handled as if they were ratio scales although almost all are just ordinal scales. That might introduce considerable futility and/or waste of scientific resources. As previously indicated, some clinical scales (e.g., TNM classification) should be represented as an ordinal scale accompanied by '0' with no absolute need for a quantitative calibration (modular scale). Although a combination composed entirely of ratio scales seems to be difficult or impossible, we believe at least that appropriate operational structures (e.g., group, field) that satisfy the conditions should always be selected in instances like the composition of a scale, analysis, and interpretation of the results. These structures must be recognized clearly by users for each assessment to avoid misestimation, overconfidence, and complacency in scales.

Conclusions
The Stevens classification of scales of measurement can be re-interpreted and modelled as an abstract algebra-like systematization. Moreover, a vector-like notation using mixed types of operations and a hierarchical structure-like systematization are possible, enabling a sophisticated means to classify, update, monitor, and forecast patient treatments. Better data mining/data usage and efficacy are expected and will be considered in future studies.