Open Access

Interpretation for scales of measurement linking with abstract algebra

Journal of Clinical Bioinformatics 2014, 4:9

https://doi.org/10.1186/2043-9113-4-9

Received: 28 January 2014

Accepted: 2 June 2014

Published: 10 June 2014

Abstract

The Stevens classification of levels of measurement involves four types of scale: “Nominal”, “Ordinal”, “Interval” and “Ratio”. This classification has been used widely in medical fields and has played an important role in the composition and interpretation of scales. With this classification, levels of measurement appear organized and validated. However, a group theory-like systematization is an attractive alternative because of its logical consistency and exception-free applicability in the natural sciences, and it may offer great advantages in clinical medicine. According to this viewpoint, the Stevens classification is reformulated within an abstract algebra-like scheme: ‘Abelian modulo additive group’ for the “Ordinal scale” accompanied with ‘zero’, ‘Abelian additive group’ for the “Interval scale”, and ‘field’ for the “Ratio scale”. Furthermore, a vector-like display arranges a mixture of schemes describing the assessment of patient states. With this vector-like notation, data-mining and data-set combination are possible on a higher abstract structure level based upon a hierarchical-cluster form. Using simple examples, we show that operations acting on the corresponding mixed schemes of this display allow for a sophisticated means of classifying, updating, monitoring, and prognosis, where better data mining/data usage and efficacy are expected.

Keywords

Scales on measurement; Stevens classification; Interpretation; Abstract algebra; Data-mining; Hierarchical cluster; Clinical medicine

Background

In 1946, S. S. Stevens devised his classification of “levels of measurement” [1], which subsequently has been used widely and has accomplished an important role in composition and interpretation of scales in medical fields. The systematics of levels of measurement seems to have been organized and validated by virtue of this classification. Nevertheless, we believe that an abstract algebra-like interpretation/systematization awaits introduction because of its logical consistency and unexceptional applicability in describing patterns and processes. We conjecture that it offers benefits in clinical medicine, especially, with respect to scales of measurement [2, 3].

Thus, in the following, we re-interpret Stevens classification, and endeavour to give it meaning in some abstract algebra-like modelling. There, the most preferred construct is a vector-like structure of various sets of scores based on individual scales and operators that permit changes of score within the set. Additionally, classical datasets that are classified in terms of the Stevens scales of measurement can be mined and combined on a higher abstract structure level based upon a hierarchical-cluster form. To explore this possibility, we provide simple examples to help readers understand this modelling tool.

§1. Application of group/field of abstract algebra to the various types of scales

Stevens classified the scales of measurement into four scale types [1]; І) “Nominal scale” that uses only labels or numbers (e.g., numbering of football players, blood type, nationality); II) “Ordinal scale” that introduces equality, rank-ordering (e.g., hardness of minerals, grading for efficacy of clinical treatment); III) “Interval scale” that is based on equally quantitative intervals (e.g., temperature as read in centigrade, duration, frequency); and ІV) “Ratio scale” that assumes a ‘zero’ as an origin, equality, rank-order, equality of intervals, and equality of ratios (e.g., absolute temperature, speed of vehicles, and most physical values) that then admit manipulations using the four arithmetic operations.

For І), the “Nominal scale”, there seems to be little room where group theoretical operations apply because within that scale only a labelling scheme is permissible. Although some non-cyclic group might be definable, it seems that little meaning can be attached to operations for this sort of scale.

For II), the “Ordinal scale”, a ranking is realised by introducing a set with an N-graded scoring like ‘1, 2, 3,…, N – 1, N’ (N: positive integer) for a score deficient in (or with no absolute need for) a quantitative character, but not requiring a ‘0’ score according to the Stevens classification. Historically, the “graphic rating scale”, a grading from I to V, was proposed by Hayes and Patterson in 1921 [4], and Freyd in 1923 [5]. However, here, we envisage either operations that decrease the score by ‘1’ in an N-graded graphic scale necessitating a ‘0’, so that {0, 1, 2, 3,…, N – 2, N – 1} establishes the scoring scale, or simply adding the score ‘0’ as in {0, 1, 2, 3,…, N – 2, N – 1, N}. We focus on the former type. Then, for an arbitrary non-negative integer X, the operation giving the remainder of X after division by N, written X (mod N), defines the cyclic group ZN = {0, 1, 2, 3,…, N – 2, N – 1}, where modulo N addition is postulated. With this assumption, given two elements ‘Xj’ and ‘Xk’ (Xj, Xk ∈ ZN) corresponding for example to the severity of a clinical symptom and/or finding, composition (denoted by ‘*’) is taken to be modulo N addition; ‘Xj*X(j→k) = Xk’ (with X(j→k) ∈ ZN). Here ‘X(j→k)’ is an operator that produces the change in score, ‘Xj → Xk’ (formally we have ‘X(j→k) = Xj⁻¹*Xk = Xk – Xj (mod N)’). Then, all scores ‘Xj’ and operators ‘X(j→k)’ are composable within a single Abelian modulo additive group ‘ZN’, where ‘Xj*Xk = Xk*Xj’ holds, at least, in terms of operation ‘*’. Thus a patient’s state corresponding to a certain illness or disease can be changed through the application of a single operation determined by the two elements belonging to ‘ZN’ [6, 7] representing the previous and current state of the patient. A simple example is presented in Appendix A.
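The modulo N composition described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `change_operator` and `apply_operator` are assumptions introduced here, and the values (N = 5, scores 1 and 4) are those of the example in Appendix A.

```python
# Sketch of the modulo-N scheme: scores and score-changing operators
# are both treated as elements of Z_N (function names are hypothetical).

def change_operator(x_from: int, x_to: int, n: int) -> int:
    """Return X(j->k) = Xk - Xj (mod N), the element that maps x_from to x_to."""
    return (x_to - x_from) % n

def apply_operator(x: int, op: int, n: int) -> int:
    """Compose a score with an operator by modulo-N addition (the '*' of the text)."""
    return (x + op) % n

N = 5
x1, x2 = 1, 4                       # initial and final scores, as in Appendix A
op = change_operator(x1, x2, N)
print(op)                           # 3
print(apply_operator(x1, op, N))    # 4, i.e. X1 * X(1->2) = X2
```

Because Python's `%` always returns a non-negative remainder for a positive modulus, a decrease in score (for example from 5 to 3 modulo 7) is automatically represented by an element of ZN rather than by a negative number.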

If a state of maximum severity is present, then the antithesis for any given disease Y is the ideal healthy state EY = [0|0|0|0|0|…], the combination of all scores being ‘0’ and represented by the identity element for group Y = {ZN×n, *}. Here, Y is the n-fold Cartesian product of ‘ZN’ (n: the number of components) that comprises all possible assessments related to each state of a given disease, for instance, ‘hypertension’, ‘hyperglycaemia’, ‘diabetes mellitus’, ‘acute pancreatitis’, ‘systemic lupus erythematosus’, and ‘cerebral artery stroke’. If in addition composition is given by modulo ‘N’ arithmetic, prime numbers (e.g., N = 7) are preferable [8, 9], and a considerable share of the components could overlap among individual diseases, as was mentioned in our previous reports [6, 7]. Note that, in practice, equal increments within a grading scheme are not always postulated. Nevertheless, the scale represented by this Abelian modulo additive group ‘ZN×n’ will be called a “modular scale”. However, it may be an atypical case (partially weakened example) of a “Ratio scale” (type IV) without the strict requirement for equal calibration. Indeed, such scales exist: in the ‘TNM classification (with a ‘T0’ entry) for malignant tumours’ [7, 10], for example, grades for scoring are determined according to histological characteristics, selection of treatment, and prognosis, with no strict linearity in scale, yet the classification might be regarded as a “modular scale”.

Based upon these results, for instance, the following are considered composable; Abelian modulo additive group Y1 = {Z7, *} for ‘hypertension’, Y2 = {Z7, *} for ‘hyperglycaemia’, Y3 = {Z7, *} for ‘diabetes mellitus’, Y4 for ‘acute pancreatitis’, Y5 for ‘systemic lupus erythematosus’, Y6 for ‘cerebral artery stroke’, Yall = {Z7 × Z7 × Z7 × …, *} = {Z7×n, *} (n: the number of components) for an entire body, and Y7 = {Z8 × Z4 × Z2 × Z2, *} for the ‘TNM classification (with a ‘T0’ entry) for malignant tumours’ [7, 10]. Additionally, these are treatable without exception within the abstract algebraic theory. For this case, an equal calibration for severity may have unbeneficial outcomes if used in clinical treatments. However, for ‘delirium’, ‘chronic liver dysfunction’, ‘acute pancreatitis’, and ‘diabetes mellitus’, for example, total scores based on equal calibration are desirable to assess disease severity.

For III), the “Interval scale”, differences in quantities are allowed. An example is ‘periods of time’ or ‘duration’, which, although it can also be measured with ratio scales, enables one period to be double another when compared. The same is true of ‘temperature’. If parameters ‘Xj’ and ‘Xl’ ∈ R (the continuous real number line) have ranges
$$-\infty < X < +\infty \tag{i}$$
we can consider an operator ‘Xk’ that causes changes from ‘Xj’ to ‘Xl’, and introduce a binary operation, denoted ‘◦’, where ordinary addition and its inverse, subtraction, are assumed;
$$X_j \circ X_k = X_j + X_k = X_l \quad (j, k, l:\ \text{session numbers}) \tag{ii}$$
In this regard, ‘Xj’ can also be expressed as a sum of an integer part and a fractional part,
$$X_j = 1 \cdot m_j + c_j \tag{iii}$$
(mj = [Xj], cj = Xj - [Xj], ‘0 ≤ cj < 1’; ‘[X]’ is the floor function, meaning the greatest integer not exceeding ‘X’). Similarly,
$$X_k = 1 \cdot m_k + c_k \quad (m_k = [X_k],\ c_k = X_k - [X_k],\ 0 \le c_k < 1) \tag{iv}$$
$$X_l = 1 \cdot m_l + c_l \quad (m_l = [X_l],\ c_l = X_l - [X_l],\ 0 \le c_l < 1) \tag{v}$$
‘1’ is a ‘unit length’ of the respective values. Thus, (iii) - (v) can be redefined using the unit length ‘1’ as an interval scale,
$$X_j \circ X_k = (1 \cdot m_j + c_j) + (1 \cdot m_k + c_k) = 1 \cdot (m_j + m_k) + (c_j + c_k) = 1 \cdot m_l + c_l = X_l \tag{vi}$$
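The decomposition in (iii)–(vi) can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic via Python's `fractions.Fraction` to avoid floating-point rounding; the function names `split` and `compose` are illustrative and not part of the original formalism. The sample values are the temperature reading and operator of Appendix B.

```python
from fractions import Fraction
from math import floor

def split(x):
    """Return (m, c) with x = 1*m + c, where m = floor(x) and 0 <= c < 1."""
    m = floor(x)
    return m, x - m

def compose(xj, xk):
    """Add two values part by part as in (vi), then re-split the sum into (m_l, c_l)."""
    mj, cj = split(xj)
    mk, ck = split(xk)
    total = (mj + mk) + (cj + ck)
    return split(total)

xj = Fraction("36.7")    # e.g. a temperature reading (deg C)
xk = Fraction("-1.6")    # e.g. the operator that changes it
print(split(xj))         # (36, Fraction(7, 10))
print(split(xk))         # (-2, Fraction(2, 5))
print(compose(xj, xk))   # (35, Fraction(1, 10)), i.e. 35.1
```

Note that the fractional parts may sum to 1 or more (here 7/10 + 2/5 = 11/10), so ‘ml’ is not always ‘mj + mk’; the equality in (vi) holds for the sum as a whole.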

There exists an identity element ‘X0’ (=0) that satisfies ‘Xj ◦ X0 = X0 ◦ Xj (= Xj + X0 = Xj + 0) = Xj’. Additionally, the inverse element is ‘Xj⁻¹ = -Xj’, satisfying ‘Xj⁻¹ ◦ Xj = Xj ◦ Xj⁻¹ = Xj + Xj⁻¹ = Xj - Xj = X0 (=0)’.

Naturally, commutativity and associativity are satisfied. Let U be the set that comprises all ‘Xj’s, i.e., U ≡ {Xj | Xj ∈ R}. Because ‘Xj, Xk, Xl ∈ set U’, the closure law holds. Therefore, this operation defines a group U = {Xj, ◦} [2, 3]. “Body temperature readings”, “clock time for the onset of sleep within a day” and “clock time for the onset of drip infusion within a day” are definable in this scale. Examples of the first two are provided in Appendix B. By making use of this procedure, the differences between quantitative values and operators are eliminated, and both can be regarded as elements belonging to a single group U. Moreover, a collection of additive Abelian groups based on an individual’s clinical values can be described, for example, as U1 ≡ {X1j | X1j ∈ R (deg C)} and U1 = {X1j, ◦} for “body temperature readings”, U2 ≡ {X2j | X2j ∈ R (/24 hrs)} and U2 = {X2j, ◦} for “clock time for the onset of sleep within a day”, U3 ≡ {X3j | X3j ∈ R (/24 hrs)} and U3 = {X3j, ◦} for “clock time for the onset of drip infusion within a day”,…, UN = {XNj, ◦},…, (N: natural number). Those are considered readily treatable and recordable within an abstract algebraic context.

For IV), the “Ratio scale”, the ‘administration of medicine (with strict dosage regimes)’ and the ‘International Statistical Classification of Diseases and Related Health Problems’ [11] were given as examples in our previous reports [6, 7]. Essentially, for this scale, because the four arithmetic operations are possible, ‘rings’ and ‘fields’ in abstract algebra are applicable so long as composition is given by modulo ‘N’ arithmetic with ‘N’ a prime. Although there could be scope where the four modulo arithmetic operations (denoted by ‘†’ in ‘Xj†Xk = Xl’) are applicable in assessment scoring in clinical medicine, it might be preferable at this stage to confine the application of ratio scales to just modulo N addition ‘*’ collectively for ‘†’, in a manner similar to that established in Appendix A. For the example given in Appendix A, the difference in interpretation is the presence/absence of an equal calibration.

Whereas the scale of ‘TNM classification for malignant tumours’ [10] was regarded as an example of an “Ordinal scale”, some of the scales defined as “Ratio scales” at initial glance should be regarded as “Ordinal scales” accompanied with ‘0’. It might be contentious whether clinical assessments performed using superficial scales based on the four arithmetic operations could have sufficient validity in clinical treatments or clinical research.

Nevertheless, other clinical scales range over a semi-open continuous interval like ‘0 ≤ X < +∞’ (X: real number), such as ‘blood concentration of white blood cells: [WBC] (/mm3)’, and ‘administration of a certain drug like lithium carbonate: [Li+] (mEq/l), sodium: [Na+] (mEq/l), calcium: [Ca++] (mg/dl), chloride: [Cl-] (mEq/l) and bicarbonate: [HCO3-] (mEq/l)’. Also, there are clinical scales whose ranges are the open interval like ‘-∞ < X < +∞’ (X: real number); ‘Anion gap [AG] = [Na+] - ([Cl-] + [HCO3-]) (reference range for blood tests: 12 ± 2 mEq/l)’ and ‘Base excess [BE] (reference range for blood tests: 0 ± 2 mmol/l)’. However, both can be treated using the notion of ‘field’ because those values are real numbers where all four arithmetic operations are included, with the exception of division by zero. Thus, the above clinical values could be definable over a ‘field’. In this regard, we assume a rule that each unit like ‘mEq/l’ accompanies the value automatically with the results of operations regardless of types of operation among the four arithmetic operations (Note that there are cases when units vanish as when ratios are taken ‘mEq/mEq (unitless)’ or displayed in reciprocal form like ‘l/mEq’). Examples for ‘[WBC] (/mm3)’, ‘[Na+] (mEq/l)’ are presented in Appendix C.
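As a small worked example of arithmetic on such real-valued clinical quantities, the anion gap formula quoted above can be written as a function. This is a sketch only: the function names are introduced here for illustration, the electrolyte values passed in are hypothetical, and the reference range 12 ± 2 mEq/l is the one stated in the text.

```python
def anion_gap(na: float, cl: float, hco3: float) -> float:
    """[AG] = [Na+] - ([Cl-] + [HCO3-]); all quantities in mEq/l."""
    return na - (cl + hco3)

def within_reference(value: float, centre: float, half_width: float) -> bool:
    """True if value lies in the closed interval centre ± half_width."""
    return abs(value - centre) <= half_width

# Hypothetical readings, chosen only to illustrate the calculation.
ag = anion_gap(na=140, cl=104, hco3=24)
print(ag)                              # 12
print(within_reference(ag, 12, 2))     # True
```

The result can be negative in principle, which is why the anion gap is described as ranging over the whole real line rather than over ‘0 ≤ X < +∞’.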

In this case, we consider a set V and assume that ‘#’ means one of ‘addition, subtraction, multiplication, and division’ collectively; thus, ‘Xj # Xk = Xl’ (∈ V), where ordinary arithmetic calculations are performed, excluding of course division by zero.

For set V, addition is commutative: Xj + Xk = Xk + Xj, and associative: (Xj + Xk) + Xl = Xj + (Xk + Xl). As for multiplication, set V meets the conditions of a ‘monoid’ [2, 3]: associativity, (Xj × Xk) × Xl = Xj × (Xk × Xl), holds, together with left and right distributivity over addition: Xj × (Xk + Xl) = Xj × Xk + Xj × Xl, (Xj + Xk) × Xl = Xj × Xl + Xk × Xl. A nonzero identity X0 (=1) for multiplication exists. For nonzero ‘Xj’, the inverse ‘Xj⁻¹ = 1/Xj’ satisfies ‘Xj × Xj⁻¹ = Xj⁻¹ × Xj = X0 (=1)’. Division, ‘Xj/Xk = Xj × Xk⁻¹’, is definable except for division by zero. Therefore, we can confirm that set V is a ‘field’. It can be expressed as V = {Xj, #} or V = {Xj | Xj ∈ R}.
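The axioms listed above can be spot-checked on a handful of sample values. The sketch below is not a proof; it merely tests the identities on a finite sample, using exact rationals (`fractions.Fraction`) as a stand-in for real numbers. The function name `check_field_axioms` and the sample values are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product

def check_field_axioms(samples):
    """Spot-check the field axioms on a finite sample of exact rationals."""
    for a, b, c in product(samples, repeat=3):
        if a + b != b + a or a * b != b * a:
            return False                       # commutativity
        if (a + b) + c != a + (b + c) or (a * b) * c != a * (b * c):
            return False                       # associativity
        if a * (b + c) != a * b + a * c or (a + b) * c != a * c + b * c:
            return False                       # left/right distributivity
    for a in samples:
        if a + 0 != a or a * 1 != a or a + (-a) != 0:
            return False                       # identities, additive inverse
        if a != 0 and a * (1 / a) != 1:
            return False                       # multiplicative inverse (a != 0)
    return True

samples = [Fraction(5000), Fraction(145), Fraction(-17), Fraction(18, 5), Fraction(0)]
print(check_field_axioms(samples))             # True
```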

Furthermore, different fields based on different sets of clinical values can be described as follows: field V1 ≡ {X1j | X1j ∈ R (/mm3)} and V1 = {X1j, #} for “blood concentration of white blood cells: [WBC] (/mm3)”, field V2 ≡ {X2j | X2j ∈ R (mEq/l)} and V2 = {X2j, #} for “administration of a certain drug like lithium carbonate: [Li+] (mEq/l)”, field V3 ≡ {X3j | X3j ∈ R (mEq/l)} and V3 = {X3j, #} for “sodium: [Na+] (mEq/l)”, field V4 ≡ {X4j, #} for calcium: [Ca++] (mg/dl), field V5 for chloride: [Cl-] (mEq/l), field V6 for ‘Anion gap [AG] (mEq/l)’, field V7 for ‘Base excess [BE] (mmol/l)’,…, VN,…, (N: natural number). For each, an independent abstract algebraic treatment is possible, as in ordinary abstract algebra.

§2. A vector-like notation using group/field operations belonging to a single set

By making use of all types of scales of measurement, we propose a vector-like expression of a patient’s state (denoted ‘Rj’, j = 1, 2, 3,…: number of sessions), where the mixed expression and its totality of operations that could be performed belong to a single set R. Because of the possible variety of operation rules, the genuine use of this set may be unwieldy at this stage.

Partially based upon our previous description [6, 7], let us define ‘Rj’ to be a vector of five clinical values,

Rj = [severity for depression (within modulo 7 arithmetic) | clock time for the onset of sleep (/24 hrs) | blood concentration of white blood cells [WBC] (/mm3) | blood concentration of [Na+] (mEq/l) | a certain value (a certain operational unit)],
$$= [\, X_{(j)1}\ (\text{mod } 7) \mid X_{(j)2}\ (/24\ \text{hrs}) \mid X_{(j)3}\ (/\text{mm}^3) \mid X_{(j)4}\ (\text{mEq/l}) \mid X_{(j)5}\ (\ldots) \,] \tag{vii}$$
Next, suppose the patient’s state ‘Rj’ changes to ‘Rj+1’ effected by operator ‘R(j→j+1)’; we denote by ‘’ the binary composition composed of the product of compositions for each component. Three possible states are:
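$$R_1 = [\, X_{(1)1} = 2\ (\text{mod } 7) \mid X_{(1)2} = 21\ (/24\ \text{hrs}) \mid X_{(1)3} = 5000\ (/\text{mm}^3) \mid X_{(1)4} = 145\ (\text{mEq/l}) \mid X_{(1)5}\ (\ldots) \,],$$
$$R_2 = [\, X_{(2)1} = 5\ (\text{mod } 7) \mid X_{(2)2} = 19.5\ (/24\ \text{hrs}) \mid X_{(2)3} = 18000\ (/\text{mm}^3) \mid X_{(2)4} = 128\ (\text{mEq/l}) \mid X_{(2)5}\ (\ldots) \,],$$
$$R_3 = [\, X_{(3)1} = 3\ (\text{mod } 7) \mid X_{(3)2} = 22\ (/24\ \text{hrs}) \mid X_{(3)3} = 7000\ (/\text{mm}^3) \mid X_{(3)4} = 158\ (\text{mEq/l}) \mid X_{(3)5}\ (\ldots) \,].$$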

For the 1st component, ‘X(1)1’,‘X(2)1’, and ‘X(3)1’, modulo 7 arithmetic (addition) is used. For the 2nd components, ‘X(1)2, X(2)2, X(3)2’, operations of Abelian addition are used. For the 3rd component, ‘X(1)3, X(2)3, X(3)3’, 4th ‘X(1)4, X(2)4, X(3)4’, the four arithmetic operators (those operations denoted by ‘#’) are required, and for the 5th, ‘X(1)5, X(2)5, X(3)5’, a certain operational unit is postulated. In the following examples, only addition/subtraction is presented; naturally, multiplication/division is also considered permissible.

With related operators R(1→ 2) = [X(1→ 2)1(mod 7) | X(1→ 2)2(/24 hrs)| X(1→ 2)3(/mm3)| X(1→ 2)4(mEq/l)|X(1→ 2)5(…)], and R(2→ 3) = [X(2→ 3)1(mod 7) | X(2→ 3)2(/24 hrs)| X(2→ 3)3(/mm3)| X(2→ 3)4(mEq/l)|X(2→ 3)5(…)]

Then, using results in Appendix D, ‘R(1→2)’ and ‘R(2→3)’ from the three states given above are as follows:
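$$R_{(1\to2)} = [\, 3\ (\text{mod } 7) \mid -1.5\ (/24\ \text{hrs}) \mid 13000\ (/\text{mm}^3) \mid -17\ (\text{mEq/l}) \mid X_{(1\to2)5}\ (\ldots) \,]$$
$$R_{(2\to3)} = [\, 5\ (\text{mod } 7) \mid 2.5\ (/24\ \text{hrs}) \mid -11000\ (/\text{mm}^3) \mid 30\ (\text{mEq/l}) \mid X_{(2\to3)5}\ (\ldots) \,]$$
Thus, we confirm the relation
$$R_1 R_{(1\to2)} R_{(2\to3)} = R_3 \tag{viii}$$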

Details are illustrated in Appendix E.
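The component-wise calculation behind relation (viii) can be reproduced with a short Python sketch. Several assumptions are made here for illustration: the helper names `operator` and `compose` are hypothetical, only the first four components are modelled (the fifth is left unspecified in the text), and the clock-time component is added without reduction modulo 24 so that the intermediate values match those quoted for ‘R(1→2)’ and ‘R(2→3)’.

```python
from fractions import Fraction as F

MOD = 7  # modulus of the 1st component (severity score)

def operator(r_from, r_to):
    """R(j->k): component-wise difference, with the 1st component reduced modulo 7."""
    d = [b - a for a, b in zip(r_from, r_to)]
    d[0] %= MOD
    return tuple(d)

def compose(r, *ops):
    """Apply operators to a state from left to right, component by component."""
    out = list(r)
    for op in ops:
        out = [a + b for a, b in zip(out, op)]
        out[0] %= MOD
    return tuple(out)

# (severity mod 7, sleep onset /24 hrs, WBC /mm3, Na+ mEq/l)
R1 = (2, F("21"),   5000,  145)
R2 = (5, F("19.5"), 18000, 128)
R3 = (3, F("22"),   7000,  158)

op_1_2 = operator(R1, R2)   # (3, -3/2, 13000, -17)
op_2_3 = operator(R2, R3)   # (5,  5/2, -11000, 30)
print(compose(R1, op_1_2, op_2_3) == R3)   # True
```

Exact fractions are used for the clock time so that the final comparison is not affected by floating-point rounding.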

Note that, in general, there exists an identity ‘E (=R0) = [0 (mod 7) | 0 (/24 hrs) | 0 (/mm3) | 0 (mEq/l) | X0 (…)]’ such that ‘RjE = ERj = Rj’. Additionally, there exists an inverse for any ‘Rj’, ‘Rj⁻¹ = [X(j)1⁻¹ (mod 7) | X(j)2⁻¹ (/24 hrs) | X(j)3⁻¹ (/mm3) | X(j)4⁻¹ (mEq/l) | X(j)5⁻¹ (…)] = [7 – X(j)1 (mod 7) | 24 - X(j)2 (/24 hrs) | - X(j)3 (/mm3) | - X(j)4 (mEq/l) | X(j)5⁻¹ (…)]’ that satisfies ‘Rj⁻¹Rj = RjRj⁻¹ = E’. However, commutativity, ‘RjRk = RkRj’ and associativity, ‘(RjRk)Rl = Rj(RkRl)’ are not satisfied. Here, we assume that operators acting on ‘Rj’s should be performed from left to right, that is, from R1 to Rm (m: number of sessions for assessment). They should not be applied between ‘Rj’s. For any assortment of ‘Rj’s with scales of measurement among types I)–IV), a single set R = {Rj | X(j)1 × X(j)2 × X(j)3 × X(j)4 × X(j)5} (‘×’ means products among groups and fields) using a vector-like notation for the scoring of patient states can be structured where all possible assessments and/or clinical findings of the patient and treatment are included. The general form is the n-fold product; set R = {Rj | X(j)1 × X(j)2 × X(j)3 × X(j)4 × … × X(j)(n-2) × X(j)(n-1) × X(j)n} (n: the number of components).

As for the possible application to better data mining or data usage from the viewpoint of our reinterpretation, we provide a simple example that may help readers to follow an outline of the argument. Consider an example of 17 states “R1, R2, …, R17” (set R), each with five components (‘n = 5’), and arrows (only symbols) that indicate the possible changes among the ‘Rj’s, as displayed in Figure 1. The scheme covers the notation of our model, and also that of existing methods where (possible) results of data, ‘Rj’s, are not combined directly with each other in the sense of operations. Then, the arrows could be re-displayed according to our concepts as operators ‘R(j→k)’ that can be regarded, like the ‘Rj’s, as elements belonging to a set R, as in Figure 2. In ordinary data sets, the ‘Rj’s are merely a collection of values and the arrows in Figure 1 are only marks. However, in our interpretation, all ‘Rj’s and ‘R(j→k)’s are elements of a single set R subject to axioms of an abstract algebra as indicated using composition symbol ‘’ in Figure 3. There, the changes ‘from Rj to Rk’ can be traced at each session. Displayed in this way, Figure 3 represents an “operational tree” that could offer potential for better data mining/data usage through a more generalized/concise treatment (e.g., withdrawing/recording correspond to schemes in Figure 3) that might be permissible. Practical improvements for efficacy, however, will need future investigations.
Figure 1

Example of a tree composed of a data set and ordinary arrows. The tree represents changes of states between 17 data elements ‘Rj’. In existing methods, data are only a collection of results that are not directly combined; manipulations of parts of the data are defined separately. The arrows merely indicate a change from one state ‘Rj’ to another ‘Rk’ and have no specific operational sense.

Figure 2

Scheme of the tree with arrows labelled by operators. Arrows are interpreted as operators ‘R(j→k)’s that could be regarded, like the ‘Rj’s, as ordinary elements belonging to a single set R. Each operator that changes ‘Rj to Rk’ can be traced and its degree for each session identified from initial and final states. The final states (R6, R9, R16 and R17) can be traced back to the initiating state ‘R1’ by performing an appropriate sequence of ‘R(j→k)’s.

Figure 3

“Operational tree”—the compositional scheme using symbol ‘’. The tree of Figure 2 is re-illustrated using composition symbol ‘’, where the operators are assumed to belong to the single set R. By admitting algebraic correspondences, this compositional scheme could potentially provide better data mining/data usage.

Here, consider the scenario of Figure 1 where from an initial value ‘R1’ there are four outcomes ‘R6’, ‘R9’, ‘R16’, and ‘R17’ containing nodes at ‘R2’ ‘R4’ ‘R10’ ‘R12’ and ‘R13’. By making use of our previous examples ‘R1 - R3’, the next simplest examples with ‘n (component number) = 5’ can be confirmed easily:
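$$R_{10} = [\, 0\ (\text{mod } 7) \mid 17\ (/24\ \text{hrs}) \mid 9000\ (/\text{mm}^3) \mid 130\ (\text{mEq/l}) \mid X_{(10)5}\ (\ldots) \,]$$
$$R_{11} = [\, 6\ (\text{mod } 7) \mid 20\ (/24\ \text{hrs}) \mid 20000\ (/\text{mm}^3) \mid 149\ (\text{mEq/l}) \mid X_{(11)5}\ (\ldots) \,]$$
$$R_{12} = [\, 4\ (\text{mod } 7) \mid 23\ (/24\ \text{hrs}) \mid 6000\ (/\text{mm}^3) \mid 140\ (\text{mEq/l}) \mid X_{(12)5}\ (\ldots) \,]$$
$$R_{13} = [\, 1\ (\text{mod } 7) \mid 18\ (/24\ \text{hrs}) \mid 5000\ (/\text{mm}^3) \mid 135\ (\text{mEq/l}) \mid X_{(13)5}\ (\ldots) \,]$$
$$R_{17} = [\, 2\ (\text{mod } 7) \mid 23.5\ (/24\ \text{hrs}) \mid 3000\ (/\text{mm}^3) \mid 150\ (\text{mEq/l}) \mid X_{(17)5}\ (\ldots) \,]$$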
Following these results, the next relations, according to the tree in Figure 3, can be obtained for instance:
$$R_1 R_{(1\to2)} R_{(2\to10)} R_{(10\to11)} R_{(11\to12)} = R_{12}$$
$$R_1 R_{(1\to2)} R_{(2\to10)} R_{(10\to13)} R_{(13\to17)} = R_{17}$$

The operator expressions are evaluated in Appendix F.
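The two relations above can also be verified mechanically from the state values given in the text. The sketch below is an illustration under stated assumptions: the helper names `operator` and `follow` are hypothetical, only addition/subtraction is used for the 3rd and 4th components, the unspecified 5th component is omitted, and the clock-time component is not reduced modulo 24.

```python
from fractions import Fraction as F

MOD = 7  # modulus of the 1st component

# (severity mod 7, sleep onset /24 hrs, WBC /mm3, Na+ mEq/l), values from the text
STATES = {
    1:  (2, F("21"),   5000,  145),
    2:  (5, F("19.5"), 18000, 128),
    10: (0, F("17"),   9000,  130),
    11: (6, F("20"),   20000, 149),
    12: (4, F("23"),   6000,  140),
    13: (1, F("18"),   5000,  135),
    17: (2, F("23.5"), 3000,  150),
}

def operator(a, b):
    """R(a->b) as the component-wise difference of states, 1st component modulo 7."""
    d = [y - x for x, y in zip(STATES[a], STATES[b])]
    d[0] %= MOD
    return tuple(d)

def follow(path):
    """Start at the first state of `path` and apply each operator in turn."""
    state = list(STATES[path[0]])
    for a, b in zip(path, path[1:]):
        state = [s + o for s, o in zip(state, operator(a, b))]
        state[0] %= MOD
    return tuple(state)

for path in ([1, 2, 10, 11, 12], [1, 2, 10, 13, 17]):
    for a, b in zip(path, path[1:]):
        print(f"R({a}->{b}) =", [str(v) for v in operator(a, b)])
    print("reaches final state:", follow(path) == STATES[path[-1]])
```

For instance, the sketch yields ‘R(2→10) = [2 (mod 7) | -2.5 (/24 hrs) | -9000 (/mm3) | 2 (mEq/l)]’ under these assumptions.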

Similarly, the next sequences are definable in principle,
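$$R_1 R_{(1\to2)} R_{(2\to3)} R_{(3\to4)} R_{(4\to5)} R_{(5\to6)} = R_6$$
$$R_1 R_{(1\to7)} R_{(7\to8)} R_{(8\to9)} = R_9$$
$$R_{12} R_{(12\to4)} = R_4$$
$$R_1 R_{(1\to2)} R_{(2\to10)} R_{(10\to13)} R_{(13\to14)} R_{(14\to15)} R_{(15\to16)} = R_{16}$$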
In general, we denote a node divergence ‘Ra to Rb (=RaR(a→b) = Rb)’ and ‘Ra to Rc (=RaR(a→c) = Rc)’ as ‘Ra[(R(a→b))(R(a→c))]’ (a,b,c: non-negative integers); here ‘( )( )( )…’ meaning simple juxtaposition. All paths belonging to the operational tree of Figure 3 can then be described/recorded, for instance, as the sequence
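$$R_1 R_{(1\to2)} [\, (R_{(2\to3)} R_{(3\to4)} R_{(4\to5)} R_{(5\to6)}) (R_{(2\to7)} R_{(7\to8)} R_{(8\to9)}) (R_{(2\to10)} [\, (R_{(10\to11)} R_{(11\to12)} R_{(12\to4)}) (R_{(10\to13)} [\, (R_{(13\to14)} R_{(14\to15)} R_{(15\to16)}) (R_{(13\to17)}) \,]) \,]) \,] \tag{ix}$$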
To display for easy recognition, for example, end states like ‘R6, R9, R16, and R17’ and divergence point ‘R4’ a notation ‘(=R6), (=R9), (=R16) and (=R17)’ might be considered. Hence,
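$$R_1 R_{(1\to2)} [\, (R_{(2\to3)} R_{(3\to4)} R_{(4\to5)} R_{(5\to6)}\ (= R_6)) (R_{(2\to7)} R_{(7\to8)} R_{(8\to9)}\ (= R_9)) (R_{(2\to10)} [\, (R_{(10\to11)} R_{(11\to12)} R_{(12\to4)}\ (= R_4)) (R_{(10\to13)} [\, (R_{(13\to14)} R_{(14\to15)} R_{(15\to16)}\ (= R_{16})) (R_{(13\to17)}\ (= R_{17})) \,]) \,]) \,] \tag{x}$$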
Moreover, consider operating on ‘R(3 → 4)[(R(4 → 5) R(5 → 6))(R(4 → 8) R(8 → 9) R(9 → 10))(R(4 → 15) R(15 → 16)) …]’ from the left-hand side with ‘R3’. The result can be expressed in accordance with the single scheme presented in Figure 3,
$$R_3 R_{(3\to4)} [\, (R_{(4\to5)} R_{(5\to6)}) (R_{(4\to8)} R_{(8\to9)} R_{(9\to10)}) (R_{(4\to15)} R_{(15\to16)}) \ldots \,] = R_6, R_{10}, R_{16}, \ldots \tag{xi}$$

Note that the above descriptions (ix)–(xi) express one-to-many functionality. However, we think that these formulae are the algebraic equivalent of the single operational tree as exemplified by Figure 3. They play an algebraic role in composite record-keeping in applied fields such as medicine. In this formalism, any possible result ‘Rj’ (set R) is obtained and traceable from any state ‘Rk’ under operations involving a plurality of elements belonging to a single set R.
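One way to make the record-keeping role of (ix) concrete is to store the operators as edges of a tree and enumerate the operator sequences it encodes. The sketch below is an assumption-laden illustration: the edge list is read off the operator indices appearing in (ix), the names `TREE`, `paths` and `as_composition` are hypothetical, and the branch through ‘R(12→4)’ is followed onward to ‘R6’ rather than stopped at ‘R4’.

```python
# Edges read off formula (ix): key = state index j, value = indices k with an operator R(j→k).
TREE = {
    1: [2],
    2: [3, 7, 10],
    3: [4], 4: [5], 5: [6],
    7: [8], 8: [9],
    10: [11, 13],
    11: [12], 12: [4],
    13: [14, 17],
    14: [15], 15: [16],
}

def paths(node, tree):
    """Enumerate every sequence of states from `node` down to an end state."""
    children = tree.get(node, [])
    if not children:
        return [[node]]
    return [[node] + rest for child in children for rest in paths(child, tree)]

def as_composition(path):
    """Write a path in the juxtaposed operator form, e.g. 'R1 R(1→2) ... = R6'."""
    ops = " ".join(f"R({a}→{b})" for a, b in zip(path, path[1:]))
    return f"R{path[0]} {ops} = R{path[-1]}"

for p in paths(1, TREE):
    print(as_composition(p))

print(sorted({p[-1] for p in paths(1, TREE)}))   # [6, 9, 16, 17]
```

The end states recovered this way are the four outcomes ‘R6’, ‘R9’, ‘R16’ and ‘R17’ named in the text.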

Additionally, we can include data mining in a more symbolic/abstract way as follows. For an arbitrary j (j = 1, 2, 3,…, m), a hierarchical-cluster-like expression can be defined [12]. For instance, a hierarchy on Rj is a set of subsets H = {1Rj, 2Rj, 3Rj,…, rRj} such that (1) Rj ∈ H; (2) for all single sets sRj in Rj, sRj ∈ H; and (3) ‘sRj ∩ tRj ∈ {ϕ, sRj, tRj}’ for all s ≠ t = 1, 2,…, r. That is, condition (3) means that either any two clusters ‘sRj and tRj’ are disjoint, or one cluster is contained entirely inside the other, and every individual unit of Rj is contained in at least one cluster larger than itself. Note that if ‘sRj ∩ tRj = ϕ’ for all s ≠ t, then the hierarchy becomes a partitioning. Henceforth, reference to a hierarchy implies that ‘sRj ∩ tRj ≠ ϕ’ for at least one pair of (s, t) values. In the previous example (vii), ‘Rj’ could be expressed in hierarchical-cluster notation where there are eight clusters (and relabeling within clusters) as shown in Figure 4. If Rj comprises ‘1Rj1 and 2Rj1’ at the first level of the hierarchy, ‘Rj = 1Rj1 ∪ 2Rj1’ holds. At the second level, ‘1Rj1 = 11Rj2 ∪ 12Rj2’ = [X(j)1 (mod 7) | X(j)2 (/24 hrs)] and ‘2Rj1 = 21Rj2 ∪ 22Rj2’ = [X(j)3 (/mm3) | X(j)4 (mEq/l) | X(j)5 (…)], whereas at the third level, ‘22Rj2 = 221Rj3 ∪ 222Rj3’ = [X(j)4 (mEq/l) | X(j)5 (…)], with 221Rj3 = [X(j)4 (mEq/l)] and 222Rj3 = [X(j)5 (…)] (Figure 4). Hence we obtain the complete set Rj = {X(j)1, X(j)2, X(j)3, X(j)4, X(j)5} = [X(j)1 (mod 7) | X(j)2 (/24 hrs) | X(j)3 (/mm3) | X(j)4 (mEq/l) | X(j)5 (…)]. A hierarchy has additional levels as necessary to reach single units at its base [12]. The top level is the entire dataset ‘Rj’ and that is always composable using base units. That is, arbitrary ‘Rk’ and ‘R1’ can be combined into a single dataset: with ‘Rk = [X(k)1 | X(k)2 |…| X(k)a]’ and ‘R1 = [X(1)1 | X(1)2 |…| X(1)b]’, we have ‘{Rk, R1} = Rj = [X(j)1 | X(j)2 |…| X(j)a | X(j)a+1 | X(j)a+2 |…| X(j)a+b]’ (a, b: positive integers).

In this way, classical datasets that are classified in the Stevens scales of measurement could be mined and combined on a higher abstract structure level. To help better understand the concept, a sequence of schemes illustrating the principles of our model is presented in Figure 5.
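The three hierarchy conditions can be tested on the five-component example with a short Python sketch. It is illustrative only: the function name `is_hierarchy` and the labels "X1" to "X5" (standing for the components X(j)1 to X(j)5) are assumptions, and the cluster list is one reading of the nesting described in the text.

```python
from itertools import combinations

def is_hierarchy(universe, clusters):
    """Check conditions (1)-(3): whole set present, singletons present, nested-or-disjoint."""
    universe = frozenset(universe)
    cl = {frozenset(c) for c in clusters}
    if any(not c <= universe for c in cl):
        return False                                  # clusters must be subsets of Rj
    if universe not in cl:
        return False                                  # condition (1)
    if any(frozenset([u]) not in cl for u in universe):
        return False                                  # condition (2)
    for a, b in combinations(cl, 2):
        inter = a & b
        if inter and inter != a and inter != b:
            return False                              # condition (3) violated
    return True

components = ["X1", "X2", "X3", "X4", "X5"]
clusters = [
    components,                       # top level: the whole of Rj
    ["X1", "X2"], ["X3", "X4", "X5"], # first split
    ["X4", "X5"],                     # deeper split of the second branch
    ["X1"], ["X2"], ["X3"], ["X4"], ["X5"],
]
print(is_hierarchy(components, clusters))                    # True
print(is_hierarchy(components, clusters + [["X2", "X3"]]))   # False: overlaps without nesting
```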
Figure 4

Systematization of “hierarchical clusters”. A hierarchical cluster is defined as the necessary class of subsets needed to decompose the set to single units. The top level is the entire dataset and that is always decomposable into base units.

Figure 5

Schemes for data mining and combination in some higher abstract structure level. First, the classical dataset, classified by the four types of scales of measurement (Stevens classification), is re-interpreted as a group/field-like operational structure. Second, a vector ‘Rj’ is defined that is composed as the product of each type of operation for all datasets other than for those classified as “Nominal scale”. Third, the ‘Rj’s are constituted as an “operational tree”. Fourth, ‘Rj’ with any arbitrary j (j = 1, 2,…, m) could be mined and combined as “hierarchical clusters”.

Subject to future improvements, we envisage that this compact description is versatile enough to provide better data mining/data usage than existing methods, although a final version is far from complete at this early stage.

§3. Supplementary suggestions and limitations

If the four arithmetic operations are appropriate in handling the values from clinical assessments, representation by “Ratio scales” (in some cases, the “modular scale” with suitable modulo number previously mentioned) might be effective in describing the clinical treatments or studies. The “Numerical rating scale (NRS)” with range ‘0–10’ [13, 14] illustrates the point where the modulo 11 additive group ‘Z11’ arises as a natural modular scale. In contrast, similar approaches might be difficult for a “visual analogue scale” [15, 16] where values could take any real number.

Whereas rating scales systemized as abstract algebra-like form may enable a more generalized/sophisticated understanding, establishing a link between fields of clinical medicine and abstract algebra, and mixed states and operators in vector-like notation as in (vii)–(xi), does not always assure more concise manipulations. A mixed treatment as exemplified in (vii)–(xi) might not always yield optimal results at present. In general, combining group and field-like structures within ‘Rj’ may cause some confusion in handling the ‘Rj’s although benefits accrue through operational compliance and convenience in dealing with the abstract algebra. For description and records, a vector-like definition ‘Rj’ may not always be advantageous in which only the four types I)–IV) are used (particularly for ‘I)’, the “nominal scale”, where systematization of operation seems to be impossible). Nevertheless, we infer that in the handling of operations in mixed-notation like ‘Rj’, the classification and synthesis of scales of measurement in some group/field-like form may be devised in a more rigorous methodology in future improvements.

That apart, similar, redundant, and obscure components may have been incorporated into the ‘Rj’s description without discretion. The ‘Rj’ in such instances loses validity and versatility in terms of a concise composition of scales. This is considered to result from the fact that a total state of a certain disease or a condition of a patient is not always composable or describable via the combination of partial components. This implies that a larger number of components is not always desirable for assessment or rating scales.

Unfortunately, almost all current assessment scales in medicine are handled as if they were ratio scales although almost all are just ordinal scales. That might introduce considerable futility and/or waste of scientific resources. As previously indicated, some clinical scales (e.g., TNM classification) should be represented as an ordinal scale accompanied by ‘0’ with no absolute need for a quantitative calibration (modular scale). Although a combination composed entirely of ratio scales seems to be difficult or impossible, we believe at least that appropriate operational structures (e.g., group, field) should always be selected that satisfy the conditions in instances like composition of scale, analysis, and interpretation of the results. These structures must be recognized clearly by users for each assessment to avoid misestimation, overconfidence, and complacency in scales.

Conclusions

The Stevens classification of scales of measurement can be re-interpreted and modelled as some abstract algebra-like systematization. Moreover, a vector-like notation using mixed types of operations and a hierarchical structure-like systematization are possible enabling a sophisticated means to classify, update, monitor, and forecast patient treatments. Better data mining/data usage and efficacy is expected and will be considered in future studies.

Appendix

Appendix A

Using ‘N = 5’ for the scale of a certain symptom or clinical finding with set Z5 ≡ {0, 1, 2, 3, 4}, we suppose ‘X1 = 1’ (∈ set Z5) for the initial state and ‘X2 = 4’ (∈ set Z5) for the final state. Expressed as ‘X1*X(1→2) = X2’, the change can be determined as ‘X(1→2) = X2 – X1 (mod 5) = 4 – 1 (mod 5) = 3 (mod 5) (∈ Z5)’.

Appendix B

Suppose the reading of ‘the body-temperature thermometer’ (deg C; degree Celsius) changes from ‘T1 = 36.7 (deg C)’ to ‘T2 = 35.1 (deg C)’. Because ‘T1 ◦ T(1→2) = T2’, the operator part is calculated as ‘T(1→2) = T2 - T1 = 35.1 - 36.7 (deg C) = -1.6 (deg C)’. As another example, when there are two clock times for the onset of sleep ‘t1 = 21 (/24 hrs)’ and ‘t2 = 19.5 (/24 hrs)’, the operator part is determined as ‘t(1→2) (/24 hrs) = t2 - t1 (/24 hrs) = 19.5 - 21 (/24 hrs) = -1.5 (/24 hrs) = 24 - 1.5 (/24 hrs) = 22.5 (/24 hrs)’.
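Both calculations can be reproduced with a brief Python sketch. The function names are assumptions made for illustration, exact fractions are used to avoid floating-point rounding, and the reduction of the clock-time operator to the range 0 to 24 follows the last step of the example above.

```python
from fractions import Fraction as F

def temperature_operator(t1, t2):
    """T(1->2) = T2 - T1 on the additive group of real temperature readings."""
    return t2 - t1

def clock_operator(t1, t2, period=24):
    """t(1->2) = t2 - t1, expressed as a clock time in the range [0, period)."""
    return (t2 - t1) % period

dT = temperature_operator(F("36.7"), F("35.1"))
dt = clock_operator(F("21"), F("19.5"))
print(float(dT))                     # -1.6
print(float(dt))                     # 22.5
print(float((F("21") + dt) % 24))    # 19.5, i.e. t1 composed with t(1->2) gives t2
```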

Appendix C

Suppose [WBC] changes in the following manner: ‘5000 (/mm3) (= W1) → 18000 (/mm3) (= W2)’. Because ‘W1 # W(1→2) = W2’, the operator denoted by ‘W(1→2)’ for addition is derived from ‘W(1→2) = W2 - W1 = 18000 - 5000 = 13000 (/mm3)’. Alternatively, the operator for multiplication is determined by division: ‘W(1→2) = W2/W1 = 18000/5000 = 3.6’, which is a dimensionless ratio.

As another example, if ‘[Na]1 = 145 (mEq/l)’ changes into ‘[Na]2 = 128 (mEq/l)’, then because ‘[Na]1 # [Na](1→2) = [Na]2’, the operator for addition is obtained from ‘[Na](1→2) = [Na]2 - [Na]1 = 128 - 145 = - 17 (mEq/l)’. Alternatively, the operator for multiplication is obtained by division: ‘[Na](1→2) = [Na]2/[Na]1 = 128/145 ≈ 0.883’, again a dimensionless ratio.
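
Both kinds of operator on a ratio scale can be computed as in the sketch below. The function names are hypothetical, and `Fraction` is used so that the ratios stay exact; the sketch assumes a non-zero initial value, since the multiplicative operator is undefined otherwise.

```python
from fractions import Fraction


def additive_operator(v1: int, v2: int) -> int:
    """Operator for addition: v1 + op = v2."""
    return v2 - v1


def multiplicative_operator(v1: int, v2: int) -> Fraction:
    """Operator for multiplication: v1 * op = v2 (requires v1 != 0)."""
    if v1 == 0:
        raise ZeroDivisionError("a ratio is undefined for an initial value of 0")
    return Fraction(v2, v1)


# White blood cell count: 5000 -> 18000 (/mm3).
wbc_add = additive_operator(5000, 18000)
wbc_mul = multiplicative_operator(5000, 18000)
print(wbc_add, float(wbc_mul))  # 13000 3.6

# Serum sodium: 145 -> 128 (mEq/l).
na_add = additive_operator(145, 128)
na_mul = multiplicative_operator(145, 128)
print(na_add, na_mul)  # -17 128/145
```

The additive operator carries the unit of the measurement, whereas the multiplicative operator is a pure number.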

Appendix D

R(1→2) = R2 - R1

= [5 (mod 7) | 19.5 (/24 hrs) | 18000 (/mm3) | 128 (mEq/l) | X(2)5 (…)] - [2 (mod 7) | 21 (/24 hrs) | 5000 (/mm3) | 145 (mEq/l) | X(1)5 (…)],

= [5 - 2 (mod 7) | 19.5 - 21 (/24 hrs) | 18000 - 5000 (/mm3) | 128 - 145 (mEq/l) | X(1→2)5 (…)],

= [3 (mod 7) | - 1.5 (/24 hrs) | 13000 (/mm3) | - 17 (mEq/l) | X(1→2)5 (…)].

R(2→3) = R3 - R2

= [3 (mod 7) | 22 (/24 hrs) | 7000 (/mm3) | 158 (mEq/l) | X(3)5 (…)] - [5 (mod 7) | 19.5 (/24 hrs) | 18000 (/mm3) | 128 (mEq/l) | X(2)5 (…)],

= [3 - 5 (mod 7) | 22 - 19.5 (/24 hrs) | 7000 - 18000 (/mm3) | 158 - 128 (mEq/l) | X(2→3)5 (…)],

= [- 2 (mod 7) | 2.5 (/24 hrs) | - 11000 (/mm3) | 30 (mEq/l) | X(2→3)5 (…)],

= [5 (mod 7) | 2.5 (/24 hrs) | - 11000 (/mm3) | 30 (mEq/l) | X(2→3)5 (…)].
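
The component-wise subtraction in Appendix D can be reproduced with the following sketch. The `Record` type, its field names, and the `change` function are hypothetical. The unspecified fifth component X5 is omitted, the first component is reduced modulo 7, and the clock-time difference is left unreduced, as in the values printed above.

```python
from typing import NamedTuple

MODULUS = 7  # size of the ordinal (modular) scale in the first component


class Record(NamedTuple):
    ordinal: int        # ordinal score, element of Z_7
    sleep_onset: float  # clock time of sleep onset (/24 hrs)
    wbc: int            # white blood cell count (/mm3)
    sodium: int         # serum sodium (mEq/l)


def change(a: Record, b: Record) -> Record:
    """Component-wise operator R(a->b) = b - a, with the ordinal part mod 7."""
    return Record(
        (b.ordinal - a.ordinal) % MODULUS,
        b.sleep_onset - a.sleep_onset,
        b.wbc - a.wbc,
        b.sodium - a.sodium,
    )


r1 = Record(2, 21.0, 5000, 145)
r2 = Record(5, 19.5, 18000, 128)
r3 = Record(3, 22.0, 7000, 158)

r12 = change(r1, r2)
r23 = change(r2, r3)
print(tuple(r12))  # (3, -1.5, 13000, -17)
print(tuple(r23))  # (5, 2.5, -11000, 30)
```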

Appendix E

R1R(1→2)R(2→3) = [2 (mod 7) | 21 (/24 hrs) | 5000 (/mm3) | 145 (mEq/l) | X(1)5 (…)][3 (mod 7) | - 1.5 (/24 hrs) | 13000 (/mm3) | - 17 (mEq/l) | X(1→2)5 (…)][5 (mod 7) | 2.5 (/24 hrs) | - 11000 (/mm3) | 30 (mEq/l) | X(2→3)5 (…)],

= [2 + 3 + 5 (mod 7) | 21 - 1.5 + 2.5 (/24 hrs) | 5000 + 13000 - 11000 (/mm3) | 145 - 17 +30 (mEq/l) | X(3)5 (…)],

= [10 (mod 7) | 22 (/24 hrs) | 7000 (/mm3) | 158 (mEq/l) | X(3)5 (…)],

= [3 (mod 7) | 22 (/24 hrs) | 7000 (/mm3) | 158 (mEq/l) | X(3)5 (…)].
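
The composition in Appendix E, in which successive operators are applied to the initial state, can be verified with a short sketch. The function name `apply_operator` is hypothetical, plain tuples stand in for the vector-like notation, and the unspecified fifth component is again omitted.

```python
MODULUS = 7  # size of the ordinal (modular) scale in the first component


def apply_operator(state: tuple, operator: tuple) -> tuple:
    """Apply a component-wise additive operator; the first component is mod 7."""
    s_ord, s_time, s_wbc, s_na = state
    o_ord, o_time, o_wbc, o_na = operator
    return (
        (s_ord + o_ord) % MODULUS,
        s_time + o_time,
        s_wbc + o_wbc,
        s_na + o_na,
    )


r1 = (2, 21.0, 5000, 145)        # initial state R1
r12 = (3, -1.5, 13000, -17)      # operator R(1->2)
r23 = (5, 2.5, -11000, 30)       # operator R(2->3)

r2 = apply_operator(r1, r12)
r3 = apply_operator(r2, r23)
print(r2)  # (5, 19.5, 18000, 128)
print(r3)  # (3, 22.0, 7000, 158)
```

The intermediate result reproduces R2 and the final result reproduces R3, which is the consistency that the appendix demonstrates by hand.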

Appendix F

For the 3rd and 4th components, only addition/subtraction is shown, for ease of comprehension.

R(2→10) = R10 - R2 = [0 - 5 (mod 7) | 17 - 19.5 (/24 hrs) | 9000 - 18000 (/mm3) | 130 - 128 (mEq/l) | X(2→10)5 (…)] = [- 5 (mod 7) | - 2.5 (/24 hrs) | - 9000 (/mm3) | 2 (mEq/l) | X(2→10)5 (…)],

R(10→11) = R11 - R10 = [6 - 0 (mod 7) | 20 - 17 (/24 hrs) | 20000 - 9000 (/mm3) | 149 - 130 (mEq/l) | X(10→11)5 (…)] = [6 (mod 7) | 3 (/24 hrs) | 11000 (/mm3) | 19 (mEq/l) | X(10→11)5 (…)],

R(11→12) = R12 - R11 = [4 - 6 (mod 7) | 23 - 20 (/24 hrs) | 6000 - 20000 (/mm3) | 140 - 149 (mEq/l) | X(11→12)5 (…)] = [- 2 (mod 7) | 3 (/24 hrs) | - 14000 (/mm3) | - 9 (mEq/l) | X(11→12)5 (…)],

R(10→13) = R13 - R10 = [1 - 0 (mod 7) | 18 - 17 (/24 hrs) | 5000 - 9000 (/mm3) | 135 - 130 (mEq/l) | X(10→13)5 (…)] = [1 (mod 7) | 1 (/24 hrs) | - 4000 (/mm3) | 5 (mEq/l) | X(10→13)5 (…)],

R(13→17) = R17 - R13 = [2 - 1 (mod 7) | 23.5 - 18 (/24 hrs) | 3000 - 5000 (/mm3) | 150 - 135 (mEq/l) | X(13→17)5 (…)] = [1 (mod 7) | 5.5 (/24 hrs) | - 2000 (/mm3) | 15 (mEq/l) | X(13→17)5 (…)].
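
The branching transitions in Appendix F can be tabulated with the sketch below. The state values are those implied by the subtractions above; the dictionary layout and the names `states`, `edges`, and `change` are hypothetical, and the fifth component is omitted. Unlike the printed values, the sketch reduces the first component into {0, ..., 6}, so ‘- 5 (mod 7)’ appears as 2 and ‘- 2 (mod 7)’ appears as 5.

```python
MODULUS = 7  # size of the ordinal (modular) scale in the first component

# States as (ordinal mod 7, sleep onset /24 hrs, WBC /mm3, Na mEq/l).
states = {
    2: (5, 19.5, 18000, 128),
    10: (0, 17.0, 9000, 130),
    11: (6, 20.0, 20000, 149),
    12: (4, 23.0, 6000, 140),
    13: (1, 18.0, 5000, 135),
    17: (2, 23.5, 3000, 150),
}

# Transitions listed in Appendix F; state 10 has two successors.
edges = [(2, 10), (10, 11), (11, 12), (10, 13), (13, 17)]


def change(a: tuple, b: tuple) -> tuple:
    """Component-wise operator b - a, with the first component mod 7."""
    return ((b[0] - a[0]) % MODULUS, b[1] - a[1], b[2] - a[2], b[3] - a[3])


operators = {(i, j): change(states[i], states[j]) for i, j in edges}
for edge, op in operators.items():
    print(edge, op)
```

Storing one operator per edge keeps each branch of the hierarchy independent, so a path such as 2 → 10 → 13 → 17 can be recomposed by applying its operators in order.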

Declarations

Acknowledgements

The authors wish to acknowledge Katsuji Nishimura, Kaoru Sakamoto, Takashi Oshimo, and Keiko Kojo for providing us with very useful advice.

Authors’ Affiliations

(1) Department of Psychiatry, Tokyo Women’s Medical University
(2) Depression Prevention Medical Center, Inariyama Takeda Hospital

References

  1. Stevens SS: On the theory of scales of measurement. Science. 1946, 103 (2684): 677-680. 10.1126/science.103.2684.677.
  2. Judson TW: Abstract Algebra: Theory and Applications. 1997, Virginia: PWS Publishing Company.
  3. Hungerford TW: Abstract Algebra, An Introduction. 1997, Philadelphia: Saunders College Publishing, 2.
  4. Hayes MHS, Patterson DG: Experimental development of the graphic rating method. Psychol Bull. 1921, 18: 98-99.
  5. Freyd M: The graphic rating scale. J Educ Psychol. 1923, 14 (2): 83-102.
  6. Sawamura J, Morishita S, Ishigooka J: A group-theoretical notation for disease states: an example using the psychiatric rating scale. Theor Biol Med Model. 2012, 9: 28. 10.1186/1742-4682-9-28. July 9.
  7. Sawamura J, Morishita S, Ishigooka J: Further suggestions on the group-theoretical approach using clinical values. Theor Biol Med Model. 2012, 9: 54. 10.1186/1742-4682-9-54. Dec 19.
  8. Tate J, Oort F: Group schemes of prime order. Ann Scient Éc Norm Sup. 1970, 3 (1): 1-21. 4e série, t.3.
  9. Jullien GA: Implementation of multiplication, modulo a prime number, with applications to number theoretic transforms. IEEE Transac Comput. 1980, C-29: 899-905.
  10. Sobin LH, Gospodarowicz MK, Wittekind C: International Union Against Cancer (UICC), TNM classification of malignant tumours. 2010, New York: Wiley-Liss, 7.
  11. WHO: International Statistical Classification of Diseases and Related Health Problems. 10th Revision. 1992, Geneva, Switzerland: World Health Organization.
  12. Billard L, Diday E: Symbolic data analysis, in 'Conceptual statistics and data mining'. 2006, England: Wiley & Sons Ltd.
  13. Turk DC, Rudy TE, Sorkin BA: Neglected topics in chronic pain treatment outcome studies: determination of success. Pain. 1993, 53 (1): 3-16. 10.1016/0304-3959(93)90049-U.
  14. Farrar JT, Young JP, LaMoreaux L, Werth JL, Poole RM: Clinical importance of changes in chronic pain intensity measured on an 11-point numerical pain rating scale. Pain. 2001, 94 (2): 149-158. 10.1016/S0304-3959(01)00349-9.
  15. Crichton N: Information point: visual analogue scale (VAS). J Clin Nurs. 2001, 10 (5): 697-706. 10.1046/j.1365-2702.2001.00525.x.
  16. Langley GB, Sheppard H: The visual analogue scale: Its use in pain measurement. Rheumatol Int. 1985, 5 (4): 145-148. 10.1007/BF00541514.

Copyright

© Sawamura et al.; licensee BioMed Central Ltd. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.