In mathematics, a multiset (or bag) is a generalization of a set. While each member of a set has only one membership, a member of a multiset can have more than one membership (meaning that there may be multiple instances of a member in a multiset, not that a single member instance may appear simultaneously in several multisets). The term "multiset" was coined by Nicolaas Govert de Bruijn in the 1970s.[1] The use of multisets in mathematics predates the name "multiset" by nearly 90 years. Richard Dedekind used multisets in a paper published in 1888.[
Overview
The total number of elements in a multiset, including repeated memberships, is the cardinality of the multiset. The number of times an element belongs to the multiset is the multiplicity of that member. For example, in the multiset {a, a, b, b, b, c} the multiplicities of the members a, b, and c are respectively 2, 3, and 1, and the cardinality of the multiset is 6. To distinguish between sets and multisets, a notation that incorporates brackets is sometimes used: the multiset {2,2,3} can be represented as [2,2,3].[3] In multisets, as in sets and in contrast to tuples, the order of elements is irrelevant: The multisets {a, b} and {b, a} are equal.
Formal definition
Within set theory, a multiset can be formally defined as a 2-tuple(A, m) where A is some set and m : A → N≥1 is a function from A to the set N≥1 = {1, 2, 3, ...} of positive natural numbers. The set A is called the underlying set of elements. For each a in A the multiplicity (that is, number of occurrences) of a is the number m(a). If a universe U in which the elements of A must live is specified, the definition can be simplified to just a multiplicity function mU : U → N from U to the set N = {0, 1, 2, 3, ...} of natural numbers, obtained by extending m to U with values 0 outside A. This extended multiplicity function is the multiplicity function called 1A below. Like any function, the function m may be defined as its graph: the set of ordered pairs { (a, m(a)) : a in A }. With these definitions the multiset written as { a, a, b } is defined as ({ a, b },{ (a, 2), (b, 1) }), and the multiset { a, b } is defined as ({ a, b },{ (a, 1), (b, 1) }).
The concept of a multiset is a generalization of the concept of a set. A multiset corresponds to an ordinary set if the multiplicity of every element is one (as opposed to some larger natural number). However to replace set theory by "multiset theory" so as to have multisets directly into the foundations is not easy: a privileged role would still have to be given to (ordinary) sets when defining maps, as there is no clear notion of maps (functions) between multisets. It can be done[not specific enough to verify], with the result that classical theorems such as the Cantor–Bernstein–Schroeder theorem or Cantor's theorem, when generalized to multisets, are false[examples needed]; they remain true only in the case of finite multisets[citation needed]. In addition, the notion of a set as a "class of items satisfying a certain property" – i.e. the extension of a predicate – is used throughout mathematics, and this notion lacks a sensible generalization to multisets with multiple memberships.
One advantage of treating multisets in their own right (as primitive, rather than defined in terms of something else) is that one can talk naturally about multiplicity 0 without having to admit that multisets are infinite – in the classical sense – since multisets have automatically their own notion of size.
An indexed family, ( ai ), where i is in some index-set, may define a multiset, sometimes written { ai }, in which the multiplicity of any element x is the number of indices i such that ai = x. The condition for this to be possible is that no element occurs infinitely many times in the family: even in an infinite multiset, the multiplicities must be finite numbers.
Multiplicity function
The set indicator function of a normal set is generalized to the multiplicity function for multisets. The set indicator function of a subset A of a set X is the function
defined by
The set indicator function of the intersection of sets is the minimum function of the indicator functions
The set indicator function of the union of sets is the maximum function of the indicator functions
The set indicator function of a subset is smaller than or equal to that of the superset
The set indicator function of a cartesian product is the product of the indicator functions of the cartesian factors
The cardinality of a (finite) set is the sum of the indicator function values
Now generalize the concept of set indicator function by releasing the constraint that the values are 0 and 1 only and allow the values 2, 3, 4 and so on. The resulting function is called a multiplicity function and such a function defines a multiset. The concepts of intersection, union, subset, cartesian product and cardinality of multisets are defined by the above formulas.
The multiplicity function of a multiset sum, is the sum of the multiplicity functions
The multiplicity function of a multiset difference is the zero-truncated subtraction of the multiplicity functions
The scalar multiplication of a multiset by a natural number n may be defined as:
A small finite multiset, A, is represented by a list where each element, x, occurs as many times as the multiplicity, 1A(x), indicates.
Examples
One of the simplest and most natural examples is the multiset of prime factors of a number n. Here the underlying set of elements is the set of prime divisors of n. For example the number 120 has the prime factorization
which gives the multiset {2, 2, 2, 3, 5}.
A related example is the multiset of solutions of an algebraic equation. A quadratic equation, for example, has two solutions. However, in some cases they are both the same number. Thus the multiset of solutions of the equation could be { 3, 5 }, or it could be { 4, 4 }. In the latter case it has a solution of multiplicity 2.
Free commutative monoids
The free commutative monoid on a set X (see free object) can be taken to be the set of finite multisets with elements drawn from X, with the monoid operation being multiset sum and the empty multiset as identity element. Such monoids are also known as (finite) formal sums of elements of X with natural coefficents. The free commutative semigroup is the subset of the free commutative monoid which contains all multisets with elements drawn from X except the empty multiset.
Free abelian groups are formal sums (i.e. linear combinations) of elements of X with integer coefficients. Equivalently, they may be seen as signed finite multisets with elements drawn from X.
Multiset coefficients
The number of multisets of cardinality k, with elements taken from a finite set of cardinality n, is called the multiset coefficient (the notation is meant to resemble that of binomial coefficients, but should not be confused with the unrelated multinomial coefficients; it is used for instance in (Stanley, 1997), and could be pronounced "n multichoose k" to resemble "n choose k" for ). Its value can be given explicitly as
where the last two expressions express the multiset coefficient in two ways as a binomial coefficient. So, the number of such multisets is the same as the number of subsets of cardinality k in a set of cardinality n + k − 1.
There are for example 4 multisets of cardinality 3 with elements taken from the set {1,2} of cardinality 2 (n=2, k=3), namely : {1,1,1}, {1,1,2}, {1,2,2}, {2,2,2}. And there are also 4 subsets of cardinality 3 in the set {1,2,3,4} of cardinality 4 (n+k-1 = 4), namely : {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}.
One simple way to prove this involves representing multisets in the following way. First, consider the notation for multisets that would represent { a, a, a, a, a, a, b, b, c, c, c, d, d, d, d, d, d, d } (6 as, 2 bs, 3 cs, 7 ds) in this form:
This is a multiset of cardinality 18 made of elements of a set of cardinality 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of cardinality 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of cardinality 4 − 1 in a set of cardinality 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of cardinality 18 of a set of cardinality 18 + 4 − 1. This is
so that is the value of the multiset coefficient
One may define a generalized binomial coefficient
in which n is not required to be a nonnegative integer, but may be negative or a non-integer, or a non-real complex number. (If k = 0, then the value of this coefficient is 1 because it is the empty product.) Then the number of multisets of cardinality k in a set of cardinality n is
This fact led Gian-Carlo Rota to ask "Why are negative sets multisets?".[citation needed] He considered that question worthy of the attention of philosophers of mathematics.
Recurrence relation
A recurrence relation for multiset coefficients may be given as
with
The above recurrence may be interpreted as follows. Let [n] := {1, ..., n} be the source set. There is always exactly one (empty) multiset of size 0, and if n = 0 there are no larger multisets, which gives the initial conditions.
Now, consider the case in which n,k > 0. A multiset of cardinality k with elements from [n] might or might not contain any instance of the final element n. If it does appear, then by removing n once, one is left with a multiset of cardinality k − 1 of elements from [n], and every such multiset can arise, which gives a total of
possibilities.
If n does not appear, then our original multiset is equal to a multiset of cardinality k with elements from [n − 1], of which there are
Thus,
Polynomial notation
The set {x} may be represented by the monomial x. The set of subsets, { {}, {x} } , is represented by the binomial 1 + x.
The set {x,y} may be represented by the monomial x·y. The set of subsets, { {}, {x}, {y}, {x,y} } , is represented by the polynomial
(1 + x)·(1 + y) = 1 + x + y + x·y.
The multiset {x,x} may be represented by the monomial x·x = x2. The multiset of submultisets, { {}, {x}, {x}, {x,x} }, is represented by the polynomial
(1 + x)2 = 1 + x + x + x·x = 1 + 2·x + x2.
The multiset of submultisets of the multiset xn is
That is why the binomial coefficient counts the number of k-combinations of an n-set.
The multiset xK·yN−K , containing N elements, K of which are hits, is called a statistical population, and a submultiset is called a statistical sample. The set of samples is
(1 + x)K·(1 + y)N−K
which by the binomial theorem equals
So the number of n-samples with k hits is
See hypergeometric distribution and inferential statistics for further on the distribution of hits.
The infinite set of finite multisets of elements taken from the set {x}:
{ {}, {x}, {x,x}, {x,x,x}, ... }
may be represented by the formal power series
S = 1 + x + x2 + x3 + ... = 1 + xS .
The formal solution,
S = (1 − x)−1,
makes sense as a set of multisets, but the intermediate result, 1−x, does not make sense as a set of multisets.
The infinite set of finite multisets of elements taken from the set x·y is
The special case y=x : The infinite multiset of finite multisets of elements taken from the multiset x2 is
(1 − x)−2 = 1 + 2·x + 3·x2 + ...
The general case: The infinite multiset of finite multisets of elements taken from the multiset xn is
.
This explains why "multisets are negative sets". The negative binomial coefficients count the number of k-multisets of elements taken from an n-set.
Cumulant generating function
A non-negative integer, n, can be represented by the monomial xn . A finite multiset of non-negative integers, say {2, 2, 2, 3, 5}, can likewise be represented by a polynomial f(x), say f(x) = 3·x2 + x3 + x5 .
It is convenient to consider the cumulant generating functiong(t) = log(f(et)), say g(t) = log(3·e2·t + e3·t + e5·t) .
The cardinality of the multiset is eg(0) = f(1), say 3 + 1 + 1 = 5.
The derivative g is g '(t) = f(et)−1·f '(et)·et, say g '(t) = (3·e2·t + e3·t + e5·t)−1·(6·e2·t + 3·e3·t + 5·e5·t) .
The mean value of the multiset is μ = g '(0) = f(1)−1·f '(1), say μ = (3+1+1)−1·(6+3+5) = 2.8 .
The variance of the multiset is σ2 = g ' '(0) .
The numbers ( μ, σ2, ··· ) = ( g '(0), g ' '(0), ··· ) are called cumulants.
The infinite set of non-negative integers {0, 1, 2, ···} is represented by the formal power series 1 + x + x2 + ··· = (1 − x)−1. The mean value and standard deviation are undefined. Nevertheless it has a cumulant-generating function, g(t) = −log(1−et). The derivative of this cumulant-generating function is g '(t) = (e−t−1)−1.
A finite multiset of real numbers , A = { Ai }, is represented by the cumulant generating function
This representation is unique: different multisets have different cumulant generating functions. See partition function (statistical mechanics) for the case where the numbers in question are the energy levels of a physical system.
The cumulant-generating function of a multiset of n real numbers having mean μ and standard deviation σ is: g(t) = log(n) + μ·t + 2−1·(σ·t)2 + ··· , and the derivative is simply: g '(t) = μ + σ2·t + ···
The cumulant-generating function of set, {k}, consisting of a single real number, k, is g(t) = k·t , and the derivative is the number itself: g '(t) = k . So the concept of the derivative of the cumulant generating function of a multiset of real numbers is a generalization of the concept of a real number.
The cumulant-generating function of a constant multiset, {k, k, k, k, ··· , k} of n elements all equal to the same real number k, is g(t) = log(n)+k·t , and the derivative is the number itself: g '(t) = k , irrespective of n.
The cumulant-generating function of the multiset of sums of elements of two multisets of numbers is the sum of the two cumulant-generating functions:
There is unfortunately no general formula for computing the cumulant generating function of a product
but the special case of a constant times a multiset of numbers is:
The multiset 2·A = {2·Ai} is not the same multiset as 2×A =A+A = {Ai+Aj}. For example, 2·{+1,−1} = {+2,−2} while 2×{+1,−1} = {+1,−1} + {+1,−1} = {+1+1,+1−1,−1+1,−1−1} = {+2,0,0,−2}.
The standard normal distribution is like a limit of big multisets of numbers.
This limit does not make sense as a multiset of numbers, but the derivative of the cumulant generating functions of the multisets in question makes sense, and the limit is well defined.
The constant term k2·log(2) vanishes by differentiation. The terms ··· vanish in the limit. So for the standard normal distribution, having mean 0 and standard deviation 1, the derivative of the cumulant generating function is simply g '(t) = t . For the normal distribution having mean μ and standard deviation σ, the derivative of the cumulant generating function is g '(t) = μ + σ2·t .
In mathematics, a multigraph or pseudograph is a graph which is permitted to have multiple edges, (also called "parallel edges"[1]), that is, edges that have the same end nodes. Thus two vertices may be connected by more than one edge. Formally, a multigraph G is an ordered pair G:=(V, E) with
V a set of vertices or nodes,
E a multiset of unordered pairs of vertices, called edges or lines.
Multigraphs might be used to model the possible flight connections offered by an airline. In this case the multigraph would be a directed graph with pairs of directed parallel edges connecting cities to show that it is possible to fly both to and from these locations.
Some authors also allow multigraphs to have loops, that is, an edge that connects a vertex to itself,[2] while others call these pseudographs, reserving the term multigraph for the case with no loops.[3]
A multidigraph is a directed graph which is permitted to have multiple arcs, i.e., arcs with the same source and target nodes. A multidigraph G is an ordered pair G:=(V,A) with
V a set of vertices or nodes,
A a multiset of ordered pairs of vertices called directed edges, arcs or arrows.
In category theory a small category can be defined as a multidigraph equipped with an associative composition law and a distinguished self-loop at each vertex serving as the left and right identity for composition. For this reason, in category theory the term graph is standardly taken to mean "multidigraph", and the underlying multidigraph of a category is called its underlying digraph.
A mixed multigraphG:=(V,E, A) may be defined in the same way as a mixed graph.
Labeling
Multigraphs and multidigraphs also support the notion of graph labeling, in a similar way. However there is no unity in terminology in this case.
The definitions of labeled multigraphs and labeled multidigraphs are similar, and we define only the latter ones here.
Definition 1: A labeled multidigraph is a labeled graph with labeled arcs.
Formally: A labeled multidigraph G is a multigraph with labeled nodes and arcs. Formally it is an 8-tuple where
V is a set of nodes and A is a multiset of arcs.
ΣV and ΣA are finite alphabets of the available node and arc labels,
and are two maps indicating the source and target node of an arc,
and are two maps describing the labeling of the nodes and edges.
Definition 2: A labeled multidigraph is a labeled graph with multiple labeled edges, i.e. edges with the same end nodes and the same edge label (note that this notion of a labeled graph is different from the notion given by the article graph labeling).
In mathematics, an ordered pair is a combination of two objects, most often coordinates (or entries or projections), in which the first (the first coordinate or first entry or left projection) is distinguished from the second (the second coordinate or second entry or right projection). If the first coordinate is a and the second is b, the usual notation for an ordered pair is (a, b). The pair is "ordered" in that (a, b) differs from (b, a) unless a = b.
Cartesian products and binary relations (and hence the ubiquitous functions) are defined in terms of ordered pairs.
Generalities
Let (a1,b1) and (a2,b2) be two ordered pairs. Then the characteristic (or defining) property of the ordered pair is:
The entries of an ordered pair can be other ordered pairs, enabling the recursive definition of ordered n-tuples (ordered lists of n terms). For example, the ordered triple (a,b,c) can be defined as (a, (b,c)), i.e., as one pair nested in another. This approach is mirrored in computer programming languages that enable constructing a list of elements by nesting cons cells. For example, the list (1 2 3 4 5) becomes (1, (2, (3, (4, (5, {}))))). The Lisp programming language employs such lists as its primary data structure.
The set of all ordered pairs whose first element is in some set X and whose second element is in some set Y is called the Cartesian product of X and Y, and written X×Y. A binary relation over the field X∪Y is a subset of X×Y.
If one wishes to employ the notation for a different purpose (such as denoting open intervals on the real number line) the ordered pair may be denoted by the variant notation
Defining the ordered pair using set theory
The above characteristic property of ordered pairs is all that is required to understand the role of ordered pairs in mathematics. Hence the ordered pair can be taken as a primitive notion, whose associated axiom is the characteristic property. This was the approach taken by the N. Bourbaki group in its Theory of Sets, published in 1954, long after Kuratowski discovered his reduction (below). The Kuratowski definition was added in the second edition of Theory of Sets, published in 1970.
If one agrees that set theory is an appealing foundation of mathematics, then all mathematical objects must be defined as sets of some sort. Hence if the ordered pair is not taken as primitive, it must be defined as a set.[1] Several set-theoretic definitions of the ordered pair are given below.
Wiener's definition
Norbert Wiener proposed the first set theoretical definition of the ordered pair in 1914 [2]:
He observed that this definition made it possible to define the types of Principia Mathematica as sets. Principia Mathematica had taken types, and hence relations of all arities, as primitive.
Hausdorff's definition
About the same time as Wiener (1914), Felix Hausdorff proposed his definition:
(a, b) := { {a, 1}, {b, 2} }
"where 1 and 2 are two distinct objects different from a and b" [3].
Kuratowski definition
In 1921 Kuratowski offered the now-accepted definition[4] of the ordered pair (a, b):
(a, b)K := {{a}, {a, b}}.
Note that this definition remains valid when the first and the second coordinate are identical, so that p = (x, x) = {{x}, {x, x}} = {{x}, {x}} = {{x}}.
Given some ordered pair p, the property "x is the first coordinate of p" can be formulated as:
The property "x is the second coordinate of p" can be formulated as:
In the case that the left and right coordinates are identical, the right conjunct is trivially true, since Y1 ≠ Y2 is never the case.
One may easily extract the first coordinate of a pair:
The second coordinate is harder to extract:
[edit]Variants
The above Kuratowski definition of the ordered pair is "adequate" in that it satisfies the characteristic property that an ordered pair must satisfy, namely that . There are other definitions, of similar or lesser complexity, that are equally adequate:
(a, b)reverse := {{b}, {a, b}};
(a, b)short := {a, {a, b}};
(a, b)01 := {{0,a}, {1, b}}.
reverse is merely a trivial variant of the Kuratowski definition, and as such is of no further interest. short is so-called because it requires two rather than three pairs of braces. Proving that short satisfies the characteristic property requires the ZFC axiom of regularity[5] Moreover, if one accepts the standard set-theoretic construction of the natural numbers, then 2 is defined as the set {0, 1} = {0, {0}}, which is indistinguishable from the pair (0, 0)short.
Proving that definitions satisfy the characteristic property
Prove: (a, b) = (c, d) if and only if a = c and b = d.
Kuratowski: If. If a = c and b = d, then {{a}, {a, b}} = {{c}, {c, d}}. Thus (a, b)K = (c, d)K.
Only if. Two cases: a = b, and a ≠ b.
If a = b:
(a, b)K = {{a}, {a, b}} = {{a}, {a, a}} = {{a}}.
(c, d)K = {{c}, {c, d}} = {{a}}.
Thus {c} = {c, d} = {a}, which implies a = c and a = d. By hypothesis, a = b. Hence b = d.
If a ≠ b, then (a, b)K = (c, d)K implies {{a}, {a, b}} = {{c}, {c, d}}.
Suppose {c, d} = {a}. Then c = d = a, and so {{c}, {c, d}} = {{a}, {a, a}} = {{a}, {a}} = {{a}}. But then {{a}, {a, b}} would also equal {{a}}, so that b = a which contradicts a ≠ b.
Suppose {c} = {a, b}. Then a = b = c, which also contradicts a ≠ b.
Therefore {c} = {a}, so that c = a and {c, d} = {a, b}.
If d = a were true, then {c, d} = {a, a} = {a} ≠ {a, b}, a contradiction. Thus d = b is the case, so that a = c and b = d.
If. If (a, b)reverse = (c, d)reverse, (b, a)K = (d, c)K. Therefore b = d and a = c.
Only if. If a = c and b = d, then {{b}, {a, b}} = {{d}, {c, d}}. Thus (a, b)reverse = (c, d)reverse.
Short:[6]
If: Obvious.
Only if: Suppose {a, {a, b}} = {c, {c, d}}. Then a is in the left hand side, and thus in the right hand side. Because equal sets have equal elements, one of a = c or a = {c, d} must be the case.
If a = {c, d}, then by similar reasoning as above, {a, b} is in the right hand side, so {a, b} = c or {a, b} = {c, d}.
If {a, b} = c then c is in {c, d} = a and a is in c, and this combination contradicts the axiom of regularity, as {a, c} has no minimal element under the relation "element of."
If {a, b} = {c, d}, then a is an element of a, from a = {c, d} = {a, b}, again contradicting regularity.
Hence a = c must hold.
Again, we see that {a, b} = c or {a, b} = {c, d}.
The option {a, b} = c and a = c implies that c is an element of c, contradicting regularity.
So we have a = c and {a, b} = {c, d}, and so: {b} = {a, b} \ {a} = {c, d} \ {c} = {d}, so b = d.
Quine-Rosser definition
Rosser (1953)[7] employed a definition of the ordered pair, due to Quine and requiring a prior definition of the natural numbers. Let be the set of natural numbers, and define
Applying this function simply increments every natural number in x. In particular, does not contain the number 0, so that for any sets x and y,
Define the ordered pair (A, B) as
Extracting all the elements of the pair that do not contain 0 and undoing yields A. Likewise, B can be recovered from the elements of the pair that do contain 0.
In type theory and in outgrowths thereof such as the axiomatic set theory NF, the Quine-Rosser pair has the same type as its projections and hence is termed a "type-level" ordered pair. Hence this definition has the advantage of enabling a function, defined as a set of ordered pairs, to have a type only 1 higher than the type of its arguments. This definition works only if the set of natural numbers is infinite. This is the case in NF, but not in type theory or in NFU. J. Barkley Rosser showed that the existence of such a type-level ordered pair (or even a "type-raising by 1" ordered pair) implies the axiom of infinity. For an extensive discussion of the ordered pair in the context of Quinian set theories, see Holmes (1998).[8]
Morse definition
Morse-Kelley set theory (Morse 1965)[9] makes free use of proper classes. Morse defined the ordered pair so that its projections could be proper classes as well as sets. (The Kuratowski definition does not allow this.) He first defined ordered pairs whose projections are sets in Kuratowski's manner. He then redefined the pair (x, y) as , where the component Cartesian products are Kuratowski pairs on sets. This second step renders possible pairs whose projections are proper classes. The Quine-Rosser definition above also admits proper classes as projections.
Category theory
A category-theoretic product A x B in a category of sets represents the set of ordered pairs, with the first element coming from A and the second coming from B. In this context the characteristic property above is a consequence of the universal property of the product and the fact that elements of a set X can be identified with morphisms from 1 (a one element set) to X. While different objects may have the universal property, they are all naturally isomorphic.
In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges. Typically, a graph is depicted in diagrammatic form as a set of dots for the vertices, joined by lines or curves for the edges.
The edges may be directed (asymmetric) or undirected (symmetric). For example, if the vertices represent people at a party, and there is an edge between two people if they shake hands, then this is an undirected graph, because if person A shook hands with person B, then person B also shook hands with person A. On the other hand, if the vertices represent people at a party, and there is an edge from person A to person B when person A knows of person B, then this graph is directed, because knowing of someone is not necessarily a symmetric relation (that is, one person knowing of another person does not necessarily imply the reverse; for example, many fans may know of a celebrity, but the celebrity is unlikely to know of all their fans). This latter type of graph is called a directed graph and the edges are called directed edges or arcs; in contrast, a graph where the edges are not directed is called undirected.
Vertices are also called nodes or points, and edges are also called lines. Graphs are the basic subject studied by graph theory. The word "graph" was first used in this sense by James Joseph Sylvester in 1878
Definitions
Definitions in graph theory vary. The following are some of the more basic ways of defining graphs and related mathematical structures.
Graph
A general example of a graph (actually, a pseudograph) with three vertices and six edges.
In the most common sense of the term,[2] a graph is an ordered pair G = (V, E) comprising a set V of vertices or nodes together with a set E of edges or lines, which are 2-element subsets of V (i.e, an edge is related with two vertices, and the relation is represented as unordered pair of the vertices with respect to the particular edge). To avoid ambiguity, this type of graph may be described precisely as undirected and simple.
Other senses of graph stem from different conceptions of the edge set. In one more generalized notion,[3]E is a set together with a relation of incidence that associates with each edge two vertices. In another generalized notion, E is a multiset of unordered pairs of (not necessarily distinct) vertices. Many authors call this type of object a multigraph or pseudograph.
All of these variants and others are described more fully below.
The vertices belonging to an edge are called the ends, endpoints, or end vertices of the edge. A vertex may exist in a graph and not belong to an edge.
V and E are usually taken to be finite, and many of the well-known results are not true (or are rather different) for infinite graphs because many of the arguments fail in the infinite case. The order of a graph is | V | (the number of vertices). A graph's size is | E | , the number of edges. The degree of a vertex is the number of edges that connect to it, where an edge that connects to the vertex at both ends (a loop) is counted twice.
For an edge {u, v}, graph theorists usually use the somewhat shorter notation uv.
Adjacency relation
The edges E of an undirected graph G induce a symmetric binary relation ~ on V that is called the adjacency relation of G. Specifically, for each edge {u, v} the vertices u and v are said to be adjacent to one another, which is denoted u ~ v.
Types of graphs
Distinction in terms of the main definition
As stated above, in different contexts it may be useful to define the term graph with different degrees of generality. Whenever it is necessary to draw a strict distinction, the following terms are used. Most commonly, in modern texts in graph theory, unless stated otherwise, graph means "undirected simple finite graph" (see the definitions below).
Undirected graph
A graph in which edges have no orientation, i.e., they are not ordered pairs, but sets {u, v} (or 2-multisets) of vertices.
Directed graph
A directed graph.
Main article: Digraph (mathematics)
A directed graph or digraph is an ordered pair D = (V, A) with
V a set whose elements are called vertices or nodes, and
A a set of ordered pairs of vertices, called arcs, directed edges, or arrows.
An arc a = (x, y) is considered to be directed from xtoy; y is called the head and x is called the tail of the arc; y is said to be a direct successor of x, and x is said to be a direct predecessor of y. If a path leads from x to y, then y is said to be a successor of x and reachablex, and x is said to be a from predecessor of y. The arc (y, x) is called the arc (x, y) inverted.
A directed graph D is called symmetric if, for every arc in D, the corresponding inverted arc also belongs to D. A symmetric loopless directed graph D = (V, A) is equivalent to a simple undirected graph G = (V, E), where the pairs of inverse arcs in A correspond 1-to-1 with the edges in E; thus the edges in G number |E| = |A|/2, or half the number of arcs in D.
A variation on this definition is the oriented graph, in which not more than one of (x, y) and (y, x) may be arcs.
Mixed graph
A mixed graph G is a graph in which some edges may be directed and some may be undirected. It is written as an ordered triple G = (V, E, A) with V, E, and A defined as above. Directed and undirected graphs are special cases.
Multigraph
A loop is an edge (directed or undirected) which starts and ends on the same vertex; these may be permitted or not permitted according to the application. In this context, an edge with two different ends is called a link.
The term "multigraph" is generally understood to mean that multiple edges (and sometimes loops) are allowed. Where graphs are defined so as to allow loops and multiple edges, a multigraph is often defined to mean a graph without loops,[4] however, where graphs are defined so as to disallow loops and multiple edges, the term is often defined to mean a "graph" which can have both multiple edges and loops,[5] although many use the term "pseudograph" for this meaning.[6]
Simple graph
A simple graph with three vertices and three edges. Each vertex has degree two, so this is also a regular graph.
As opposed to a multigraph, a simple graph is an undirected graph that has no loops and no more than one edge between any two different vertices. In a simple graph the edges of the graph form a set (rather than a multiset) and each edge is a pair of distinctn vertices every vertex has a degree that is less than n (the converse, however, is not true - there exist non-simple graphs with nn). vertices. In a simple graph with vertices in which every vertex has a degree smaller than
Weighted graph
A graph is a weighted graph if a number (weight) is assigned to each edge. Such weights might represent, for example, costs, lengths or capacities, etc. depending on the problem.
The weight of the graph is sum of the weights given to all edges.
Half-edges, loose edges
In exceptional situations it is even necessary to have edges with only one end, called half-edges, or no ends (loose edges); see for example signed graphs and biased graphs.
Important graph classes
Regular graph
A regular graph is a graph where each vertex has the same number of neighbors, i.e., every vertex has the same degree or valency. A regular graph with vertices of degree k is called a k‑regular graph or regular graph of degree k.
Complete graph
A complete graph with 5 vertices. Each vertex has an edge to every other vertex.
Complete graphs have the feature that each pair of vertices has an edge connecting them.
Finite and infinite graphs
A finite graph is a graph G = (V, E) such that V and Efinite sets. An infinite graph is one with an infinite are set of vertices or edges or both.
Most commonly in graph theory it is implied that the graphs discussed are finite. If the graphs are infinite, that is usually specifically stated.
Graph classes in terms of connectivity
In an undirected graph G, two vertices u and v are called connected if G contains a path from u to v. Otherwise, they are called disconnected. A graph is called connected if every pair of distinct vertices in the graph is connected; otherwise, it is called disconnected.
A graph is called k-vertex-connected or k-edge-connected if removal some set of k or more vertices (respectively, edges) makes the graph disconnected. A k-vertex-connected graph is often called simply k-connected.
A directed graph is called weakly connected if replacing all of its directed edges with undirected edges produces a connected (undirected) graph. It is strongly connected or strong if it contains a directed path from uv and a directed path from v to u for every pair of vertices u, v. to
Properties of graphs
Two edges of a graph are called adjacent (sometimes coincident) if they share a common vertex. Two arrows of a directed graph are called consecutive if the head of the first one is at the nock (notch end) of the second one. Similarly, two vertices are called adjacent if they share a common edge (consecutive if they are at the notch and at the head of an arrow), in which case the common edge is said to join the two vertices. An edge and a vertex on that edge are called incident.
The graph with only one vertex and no edges is called the trivial graph. A graph with only vertices and no edges is known as an edgeless graph. The graph with no vertices and no edges is sometimes called the null graph or empty graph, but the terminology is not consistent and not all mathematicians allow this object.
In a weighted graph or digraph, each edge is associated with some value, variously called its cost, weight, length or other term depending on the application; such graphs arise in many contexts, for example in optimal routing problems such as the traveling salesman problem.
Normally, the vertices of a graph, by their nature as elements of a set, are distinguishable. This kind of graph may be called vertex-labeled. However, for many questions it is better to treat vertices as indistinguishable; then the graph may be called unlabeled. (Of course, the vertices may be still distinguishable by the properties of the graph itself, e.g., by the numbers of incident edges). The same remarks apply to edges, so that graphs which have labeled edges are called edge-labeled graphs. Graphs with labels attached to edges or vertices are more generally designated as labeled. Consequently, graphs in which vertices are indistinguishable and edges are indistinguishable are called unlabeled. (Note that in the literature the term labeled may apply to other kinds of labeling, besides that which serves only to distinguish different vertices or edges.)
Examples
A graph with six nodes.
The picture is a graphic representation of the following graph
The fact that vertex 1 is adjacent to vertex 2 is sometimes denoted by 1 ~ 2.
In category theory a small category can be considered a directed multigraph with the objects as vertices and the morphisms as directed edges. The functors between categories induce then some, but not necessarily all, of the digraph morphisms.
In computer science directed graphs are used to represent knowledge (e.g., Conceptual graph), finite state machines and many other discrete structures.
A binary relation R on a set X is a directed graph. Two elements x, y of X are connected by an arrow iff xRy.
Important graphs
Basic examples are:
In a complete graph each pair of vertices is joined by an edge, that is, the graph contains all possible edges.
In a bipartite graph, the vertices can be divided into two sets, W and X, so that every edge has one vertex in each of the two sets.
In a complete bipartite graph, the vertex set is the union of two disjoint subsets, W and X, so that every vertex in W is adjacent to every vertex in X but there are no edges within W or X.
In a linear graph or path graph of length n, the vertices can be listed in order, v0, v1, ..., vn, so that the edges are vi−1vi for each i = 1, 2, ..., n. If a linear graph occurs as a subgraph of another graph, it is a path in that graph.
In a cycle graph of length n vertices can be named v1, ..., vn with n at least 3, so that the edges are vi−1vi for each i = 2,...,n and vnv1. Cycle graphs can be characterized as connectedcycle or circuit in that graph. graphs with degree 2 at every vertex. If a cycle graph occurs as a subgraph of another graph, it defines a
A planar graph can be drawn in a plane with no crossing edges (i.e., embedded in a plane).
A tree is a connected graph with no cycles.
A forest is a graph with no cycles (i.e. one or more trees).
More advanced kinds of graphs are:
The Petersen graph and its generalizations
Perfect graphs
Cographs
Other graphs with large automorphism groups: vertex-transitive, arc-transitive, and distance-transitive graphs.
Strongly regular graphs and their generalization distance-regular graphs.
Operations on graphs
There are several operations that produce new graphs from old ones, which might be classified into the following categories:
Elementary operations, sometimes called "editing operations" on graphs, which create a new graph from the original one by a simple, local change, such as addition or deletion of a vertex or an edge, merging and splitting of vertices, etc.
Graph rewrite operations replacing the occurrence of some pattern graph within the host graph by an instance of the corresponding replacement graph.
Unary operations, which create a significantly new graph from the old one. Examples:
Line graph
Dual graph
Complement graph
Binary operations, which create new graph from two initial graphs. Examples:
Disjoint union of graphs
Cartesian product of graphs
Tensor product of graphs
Strong product of graphs
Lexicographic product of graphs
Generalizations
In a hypergraph, an edge can join more than two vertices.
An undirected graph can be seen as a simplicial complex consisting of 1-simplices (the edges) and 0-simplices (the vertices). As such, complexes are generalizations of graphs since they allow for higher-dimensional simplices.
Every graph gives rise to a matroid.
In model theory, a graph is just a structure. But in that case, there is no limitation on the number of edges: it can be any cardinal number, see continuous graph.
In computational biology, power graph analysis introduces power graphs as an alternative representation of undirected graphs.
In linear algebra an n-by-n (square) matrix A is called invertible or nonsingular or nondegenerate, if there exists an n-by-n matrix B such that
where In denotes the n-by-n identity matrix and the multiplication used is ordinary matrix multiplication. If this is the case, then the matrix B is uniquely determined by A and is called the inverse of A, denoted by A−1. It follows from the theory of matrices that if
for square matrices A and B, then also
[1]
Non-square matrices (m-by-n matrices for which m ≠ n) do not have an inverse. However, in some cases such a matrix may have a left inverse or right inverse. If A is m-by-n and the rank of A is equal to n, then A has a left inverse: an n-by-m matrix B such that BA = I. If A has rank m, then it has a right inverse: an n-by-m matrix B such that AB = I.
A square matrix that is not invertible is called singular or degenerate. A square matrix is singular if and only if its determinant is 0. Singular matrices are rare in the sense that if you pick a random square matrix, it will almost surely not be singular.
While the most common case is that of matrices over the real or complex numbers, all these definitions can be given for matrices over any commutative ring. However, in this case the condition for a square matrix to be invertible is that its determinant is invertible in the ring, which in general is a much stricter requirement than being nonzero.
Matrix inversion is the process of finding the matrix B that satisfies the prior equation for a given invertible matrix A.
Properties
Let A be a square n by n matrix over a field K (for example the field R of real numbers). Then the following statements are equivalent:
A is invertible.
A is row-equivalent to the n-by-n identity matrix In.
A is column-equivalent to the n-by-n identity matrix In.
A has n pivot positions.
det A ≠ 0. In general, a square matrix over a commutative ring is invertible if and only if its determinant is a unit in that ring.
rank A = n.
The equation Ax = 0 has only the trivial solution x = 0 (i.e., NullA = {0})
The equation Ax = b has exactly one solution for each b in Kn, (x ≠ 0).
The columns of A are linearly independent.
The columns of A span Kn (i.e. Col A = Kn).
The columns of A form a basis of Kn.
The linear transformation mapping x to Ax is a bijection from Kn to Kn.
There is an n by n matrix B such that AB = In.
There is an n by n matrix C such that CA = In.
The transpose AT is an invertible matrix (hence rows of A are linearly independent, span Kn, and form a basis of Kn).
The number 0 is not an eigenvalue of A.
The matrix A can be expressed as a finite product of elementary matrices.
Furthermore, the following properties hold for an invertible matrix A:
.
for nonzero scalar k
For any invertible n×n matrices A and B. More generally, if A1,...,Ak are invertible n×n matrices, then
A matrix that is its own inverse, i.e. and , is called an involution.
Density
Over the field of real numbers, the set of singular n-by-n matrices, considered as a subset of , is a null set, i.e., has Lebesgue measure zero. This is true because singular matrices are the roots of the polynomial function in the entries of the matrix given by the determinant. Thus in the language of measure theory, almost all n-by-n matrices are invertible.
Furthermore the n-by-n invertible matrices are a dense open set in the topological space of all n-by-n matrices. Equivalently, the set of singular matrices is closed and nowhere dense in the space of n-by-n matrices.
In practice however, one may encounter non-invertible matrices. And in numerical calculations, matrices which are invertible, but close to a non-invertible matrix, can still be problematic; such matrices are said to be ill-conditioned.
Methods of matrix inversion
Gaussian elimination
Gauss–Jordan elimination is an algorithm that can be used to determine whether a given matrix is invertible and to find the inverse. An alternative is the LU decomposition which generates an upper and a lower triangular matrices which are easier to invert. For special purposes, it may be convenient to invert matrices by treating mn-by-mn matrices as m-by-m matrices of n-by-n matrices, and applying one or another formula recursively (other sized matrices can be padded out with dummy rows and columns). For other purposes, a variant of Newton's method may be convenient (particularly when dealing with families of related matrices, so inverses of earlier matrices can be used to seed generating inverses of later matrices).
Analytic solution
Writing the transpose of the matrix of cofactors, known as an adjugate matrix, can also be an efficient way to calculate the inverse of small matrices, but this recursive method is inefficient for large matrices. To determine the inverse, we calculate a matrix of cofactors:
where |A| is the determinant of A, Cij is the matrix of cofactors, and AT represents the matrix transpose.
For most practical applications, it is not necessary to invert a matrix to solve a system of linear equations; however, for a unique solution, it is necessary that the matrix involved be invertible.
Decomposition techniques like LU decomposition are much faster than inversion, and various fast algorithms for special classes of linear systems have also been developed.
Inversion of 2×2 matrices
The cofactor equation listed above yields the following result for 2×2 matrices. Inversion of these matrices can be done easily as follows: [2]
This is possible because 1/(ad-bc) is the reciprocal of the determinant of the matrix in question, and the same strategy could be used for other matrix sizes.
Inversion of 3×3 matrices
A computationally efficient 3x3 matrix inversion is given by
where
Z = a(ek − fh) + b(fg − dk) + c(dh − eg)
which is the determinant of the matrix. If Z is finite (non-zero), the matrix is invertible, with the elements of the above matrix on the right side given by
The general 3×3 inverse can be expressed concisely in terms of the cross product and triple product,:
If a matrix (consisting of three column vectors, , , and ) is invertible, its inverse is given by
where is a column vector and is a row vector. Note that det(A) is equal to the triple product of , , and —the volume of the paralallelapiped formed by the rows or columns:
The correctness of the formula can be checked by using cross- and triple-product properties and by noting that for groups, left and right inverses always coincide. Intuitively, because of the cross products, each row of is orthogonal to the non-corresponding two columns of (causing the off-diagonal terms of be zero). Dividing by
causes the diagonal elements of to be unity. For example, the first diagonal is:
.
Blockwise inversion
Matrices can also be inverted blockwise by using the following analytic inversion formula:
where A, B, C and D are matrix sub-blocks of arbitrary size. (A and D must, of course, be square, so that they can be inverted. Furthermore, this is true if and only if A and D−CA−1B are nonsingular [3] ). This strategy is particularly advantageous if A is diagonal and D−CA−1B (the Schur complement of A) is a small matrix, since they are the only matrices requiring inversion. This technique was reinvented several times and is due to Hans Boltz (1923),[citation needed] who used it for the inversion of geodetic matrices, and Tadeusz Banachiewicz (1937), who generalized it and proved its correctness.
The nullity theorem says that the nullity of A equals the nullity of the sub-block in the lower right of the inverse matrix, and that the nullity of B equals the nullity of the sub-block in the upper right of the inverse matrix.
The inversion procedure that led to Equation (1) performed matrix block operations that operated on C and D first. Instead, if A and B are operated on first, and provided D and A−BD−1C are nonsingular [4] , the result is
Equating Equations (1) and (2) leads to
where Equation (3) is the matrix inversion lemma, which is equivalent to the binomial inverse theorem.
By Neumann series
If a matrix A has the property that
then A is nonsingular and its inverse may be expressed by a Neumann series:[5]
Truncating the sum results in an "approximate" inverse which may be useful as a preconditioner.
More generally, if A is "near" the invertible matrix X in the sense that
then A is nonsingular and its inverse is
If it is also the case that A-X has rank 1 then this simplifies to
Derivative of the matrix inverse
Suppose that the invertible matrix A depends on a parameter t. Then the derivative of the inverse of A with respect to t is given by
To derive the above expression for the derivative of the inverse of A, one can differentiate the definition of the matrix inverse and then solve for the inverse of A:
Subtracting from both sides of the above and multiplying on the right by gives the correct expression for the derivative of the inverse:
Similarly, if ε is a small number then
Moore–Penrose pseudoinverse
Some of the properties of inverse matrices are shared by Moore–Penrose pseudoinverses, which can be defined for any m-by-n matrix.
Matrix inverses in real-time simulations
Matrix inversion plays a significant role in computer graphics, particularly in 3D graphics rendering and 3D simulations. Examples include screen-to-world ray casting, world-to-subspace-to-world object transformations, and physical simulations.
In mathematics, a matrix (plural matrices, or less commonly matrixes) is a rectangular array of numbers, such as
An item in a matrix is called an entry or an element. The example has entries 1, 9, 13, 20, 55, and 4. Entries are often denoted by a variable with two subscripts, as shown on the right. Matrices of the same size can be added and subtracted entrywise and matrices of compatible sizes can be multiplied. These operations have many of the properties of ordinary arithmetic, except that matrix multiplication is not commutative, that is, AB and BA are not equal in general. Matrices consisting of only one column or row define the components of vectors, while higher-dimensional (e.g., three-dimensional) arrays of numbers define the components of a generalization of a vector called a tensor. Matrices with entries in other fields or rings are also studied.
Matrices are a key tool in linear algebra. One use of matrices is to represent linear transformations, which are higher-dimensional analogs of linear functions of the form f(x) = cx, where c is a constant; matrix multiplication corresponds to composition of linear transformations. Matrices can also keep track of the coefficients in a system of linear equations. For a square matrix, the determinant and inverse matrix (when it exists) govern the behavior of solutions to the corresponding system of linear equations, and eigenvalues and eigenvectors provide insight into the geometry of the associated linear transformation.
Matrices find many applications. Physics makes use of matrices in various domains, for example in geometrical optics and matrix mechanics; the latter led to studying in more detail matrices with an infinite number of rows and columns. Graph theory uses matrices to keep track of distances between pairs of vertices in a graph. Computer graphics uses matrices to project 3-dimensional space onto a 2-dimensional screen. Matrix calculus generalizes classical analytical notions such as derivatives of functions or exponentials to matrices. The latter is a recurring need in solving ordinary differential equations. Serialism and dodecaphonism are musical movements of the 20th century that use a square mathematical matrix to determine the pattern of music intervals.
A major branch of numerical analysis is devoted to the development of efficient algorithms for matrix computations, a subject that is centuries old but still an active area of research. Matrix decomposition methods simplify computations, both theoretically and practically. For sparse matrices, specifically tailored algorithms can provide speedups; such matrices arise in the finite element method, for example.
Definition
A matrix is a rectangular arrangement of numbers.[1] For example,
An alternative notation uses large parentheses instead of box brackets:
The horizontal and vertical lines in a matrix are called rows and columns, respectively. The numbers in the matrix are called its entries or its elements. To specify a matrix's size, a matrix with m rows and n columns is called an m-by-n matrix or m × n matrix, while m and n are called its dimensions. The above is a 4-by-3 matrix.
A matrix with one row (a 1 × n matrix) is called a row vector, and a matrix with one column (an m × 1 matrix) is called a column vector. Any row or column of a matrix determines a row or column vector, obtained by removing all other rows respectively columns from the matrix. For example, the row vector for the third row of the above matrix A is
When a row or column of a matrix is interpreted as a value, this refers to the corresponding row or column vector. For instance one may say that two different rows of a matrix are equal, meaning they determine the same row vector. In some cases the value of a row or column should be interpreted just as a sequence of values (an element of Rn if entries are real numbers) rather than as a matrix, for instance when saying that the rows of a matrix are equal to the corresponding columns of its transpose matrix.
Most of this article focuses on real and complex matrices, i.e., matrices whose entries are real or complex numbers. More general types of entries are discussed below.
Notation
The specifics of matrices notation varies widely, with some prevailing trends. Matrices are usually denoted using upper-case letters, while the corresponding lower-case letters, with two subscript indices, represent the entries. In addition to using upper-case letters to symbolize matrices, many authors use a special typographical style, commonly boldface upright (non-italic), to further distinguish matrices from other variables. An alternative notation involves the use of a double-underline with the variable name, with or without boldface style, (e.g., ).
The entry that lies in the i-th row and the j-th column of a matrix is typically referred to as the i,j, (i,j), or (i,j)th entry of the matrix. For example, the (2,3) entry of the above matrix A is 7. The (i, j)th entry of a matrix A is most commonly written as ai,j. Alternative notations for that entry are A[i,j] or Ai,j.
Sometimes a matrix is referred to by giving a formula for its (i,j)th entry, often with double parenthesis around the formula for the entry, for example, if the (i,j)th entry of A were given by aij, A would be denoted ((aij)).
An asterisk is commonly used to refer to whole rows or columns in a matrix. For example, ai,∗ refers to the ith row of A, and a∗,j refers to the jth column of A. The set of all m-by-n matrices is denoted (m, n).
A common shorthand is
A = [ai,j]i=1,...,m; j=1,...,n or more briefly A = [ai,j]m×n
to define an m × n matrix A. Usually the entries ai,j are defined separately for all integers 1 ≤ i ≤ m and 1 ≤ j ≤ n. They can however sometimes be given by one formula; for example the 3-by-4 matrix
can alternatively be specified by A = [i − j]i=1,2,3; j=1,...,4, or simply A = ((i-j)), where the size of the matrix is understood.
Some programming languages start the numbering of rows and columns at zero, in which case the entries of an m-by-n matrix are indexed by 0 ≤ i ≤ m − 1 and 0 ≤ j ≤ n − 1.[2] This article follows the more common convention in mathematical writing where enumeration starts from 1.
Basic operations
There are a number of operations that can be applied to modify matrices called matrix addition, scalar multiplication and transposition.[3] These form the basic techniques to deal with matrices.
Operation
Definition
Example
Addition
The sumA+B of two m-by-n matrices A and B is calculated entrywise:
(A + B)i,j = Ai,j + Bi,j, where 1 ≤ i ≤ m and 1 ≤ j ≤ n.
Scalar multiplication
The scalar multiplication cA of a matrix A and a number c (also called a scalar in the parlance of abstract algebra) is given by multiplying every entry of A by c:
(cA)i,j = c · Ai,j.
Transpose
The transpose of an m-by-n matrix A is the n-by-m matrix AT (also denoted Atr or tA) formed by turning rows into columns and vice versa:
(AT)i,j = Aj,i.
Familiar properties of numbers extend to these operations of matrices: for example, addition is commutative, i.e. the matrix sum does not depend on the order of the summands: A + B = B + A.[4] The transpose is compatible with addition and scalar multiplication, as expressed by (cA)T = c(AT) and (A + B)T = AT + BT. Finally, (AT)T = A.
Row operations are ways to change matrices. There are three types of row operations: row switching, that is interchanging two rows of a matrix, row multiplication, multiplying all entries of a row by a non-zero constant and finally row addition which means adding a multiple of a row to another row. These row operations are used in a number of ways including solving linear equations and finding inverses.
Matrix multiplication, linear equations and linear transformations
Schematic depiction of the matrix product AB of two matrices A and B.
Multiplication of two matrices is defined only if the number of columns of the left matrix is the same as the number of rows of the right matrix. If A is an m-by-n matrix and B is an n-by-p matrix, then their matrix productAB is the m-by-p matrix whose entries are given by dot-product of the corresponding row of A and the corresponding column of B:
where 1 ≤ i ≤ m and 1 ≤ j ≤ p.[5] For example (the underlined entry 1 in the product is calculated as the product 1 · 1 + 0 · 1 + 2 · 0 = 1):
Matrix multiplication satisfies the rules (AB)C = A(BC) (associativity), and (A+B)C = AC+BC as well as C(A+B) = CA+CB (left and right distributivity), whenever the size of the matrices is such that the various products are defined.[6] The product AB may be defined without BA being defined, namely if A and B are m-by-n and n-by-k matrices, respectively, and m ≠ k. Even if both products are defined, they need not be equal, i.e. generally one has
AB ≠ BA,
i.e., matrix multiplication is not commutative, in marked contrast to (rational, real, or complex) numbers whose product is independent of the order of the factors. An example of two matrices not commuting with each other is:
whereas
The identity matrix In of size n is the n-by-n matrix in which all the elements on the main diagonal are equal to 1 and all other elements are equal to 0, e.g.
It is called identity matrix because multiplication with it leaves a matrix unchanged: MIn = ImM = M for any m-by-n matrix M.
Besides the ordinary matrix multiplication just described, there exist other less frequently used operations on matrices that can be considered forms of multiplication, such as the Hadamard product and the Kronecker product.[7] They arise in solving matrix equations such as the Sylvester equation.
Linear equations
A particular case of matrix multiplication is tightly linked to linear equations: if x designates a column vector (i.e. n×1-matrix) of n variables x1, x2, ..., xn, and A is an m-by-n matrix, then the matrix equation
Ax = b,
where b is some m×1-column vector, is equivalent to the system of linear equations
A1,1x1 + A1,2x2 + ... + A1,nxn = b1
...
Am,1x1 + Am,2x2 + ... + Am,nxn = bm .[8]
This way, matrices can be used to compactly write and deal with multiple linear equations, i.e. systems of linear equations.
Linear transformations
Matrices and matrix multiplication reveal their essential features when related to linear transformations, also known as linear maps. A real m-by-n matrix A gives rise to a linear transformation Rn → Rm mapping each vector x in Rn to the (matrix) product Ax, which is a vector in Rm. Conversely, each linear transformation f: Rn → Rm arises from a unique m-by-n matrix A: explicitly, the (i, j)-entry of A is the ith coordinate of f(ej), where ej = (0,...,0,1,0,...,0) is the unit vector with 1 in the jth position and 0 elsewhere. The matrix A is said to represent the linear map f, and A is called the transformation matrix of f.
The following table shows a number of 2-by-2 matrices with the associated linear maps of R2. The blue original is mapped to the green grid and shapes, the origin (0,0) is marked with a black point.
Vertical shear with m=1.25.
Horizontal flip
Squeeze mapping with r=3/2
Scaling by a factor of 3/2
Rotation by π/6R = 30°
Under the 1-to-1 correspondence between matrices and linear maps, matrix multiplication corresponds to composition of maps:[9] if a k-by-m matrix B represents another linear map g : Rm → Rk, then the composition g ∘ f is represented by BA since
(g ∘ f)(x) = g(f(x)) = g(Ax) = B(Ax) = (BA)x.
The last equality follows from the above-mentioned associativity of matrix multiplication.
The rank of a matrixA is the maximum number of linearly independent row vectors of the matrix, which is the same as the maximum number of linearly independent column vectors.[10] Equivalently it is the dimension of the image of the linear map represented by A.[11] The rank-nullity theorem states that the dimension of the kernel of a matrix plus the rank equals the number of columns of the matrix.[12]
Square matrices
A square matrix is a matrix which has the same number of rows and columns. An n-by-n matrix is known as a square matrix of order n. Any two square matrices of the same order can be added and multiplied. A square matrix A is called invertible or non-singular if there exists a matrix B such that
AB = In.[13]
This is equivalent to BA = In.[14] Moreover, if B exists, it is unique and is called the inverse matrix of A, denoted A−1.
The entries Ai,i form the main diagonal of a matrix. The trace, tr(A) of a square matrix A is the sum of its diagonal entries. While, as mentioned above, matrix multiplication is not commutative, the trace of the product of two matrices is independent of the order of the factors: tr(AB) = tr(BA).[15]
If all entries outside the main diagonal are zero, A is called a diagonal matrix. If only all entries above (below) the main diagonal are zero, A is called a lower triangular matrix (upper triangular matrix, respectively). For example, if n = 3, they look like
(diagonal), (lower) and (upper triangular matrix).
Determinant
A linear transformation on R2 given by the indicated matrix. The determinant of this matrix is −1, as the area of the green parallelogram at the right is 1, but the map reverses the orientation, since it turns the counterclockwise orientation of the vectors to a clockwise one.
The determinant det(A) or |A| of a square matrix A is a number encoding certain properties of the matrix. A matrix is invertible if and only if its determinant is nonzero. Its absolute value equals the area (in R2) or volume (in R3) of the image of the unit square (or cube), while its sign corresponds to the orientation of the corresponding linear map: the determinant is positive if and only if the orientation is preserved.
The determinant of 2-by-2 matrices is given by
the determinant of 3-by-3 matrices involves 6 terms (rule of Sarrus). The more lengthy Leibniz formula generalises these two formulae to all dimensions.[16]
The determinant of a product of square matrices equals the product of their determinants: det(AB) = det(A) · det(B).[17] Adding a multiple of any row to another row, or a multiple of any column to another column, does not change the determinant. Interchanging two rows or two columns affects the determinant by multiplying it by −1.[18] Using these operations, any matrix can be transformed to a lower (or upper) triangular matrix, and for such matrices the determinant equals the product of the entries on the main diagonal; this provides a method to calculate the determinant of any matrix. Finally, the Laplace expansion expresses the determinant in terms of minors, i.e., determinants of smaller matrices.[19] This expansion can be used for a recursive definition of determinants (taking as starting case the determinant of a 1-by-1 matrix, which is its unique entry, or even the determinant of a 0-by-0 matrix, which is 1), that can be seen to be equivalent to the Leibniz formula. Determinants can be used to solve linear systems using Cramer's rule, where the division of the determinants of two related square matrices equates to the value of each of the system's variables.[20]
Eigenvalues and eigenvectors
A number λ and a non-zero vector v satisfying
Av = λv
are called an eigenvalue and an eigenvector of A, respectively.[nb 1][21] The number λ is an eigenvalue of an n×n-matrix A if and only if A−λIn is not invertible, which is equivalent to
[22]
The function pA(t) = det(A−tI) is called the characteristic polynomial of A, its degree is n. Therefore pA(t) has at most n different roots, i.e., eigenvalues of the matrix.[23] They may be complex even if the entries of A are real. According to the Cayley-Hamilton theorem, pA(A) = 0, that is to say, the characteristic polynomial applied to the matrix itself yields the zero matrix.
Symmetry
A square matrix A that is equal to its transpose, i.e. A = AT, is a symmetric matrix; if it is equal to the negative of its transpose, i.e. A = −AT, then it is a skew-symmetric matrix. In complex matrices, symmetry is often replaced by the concept of Hermitian matrices, which satisfy A∗ = A, where the star or asterisk denotes the conjugate transpose of the matrix, i.e. the transpose of the complex conjugate of A.
By the spectral theorem, real symmetric matrices and complex Hermitian matrices have an eigenbasis; i.e., every vector is expressible as a linear combination of eigenvectors. In both cases, all eigenvalues are real.[24] This theorem can be generalized to infinite-dimensional situations related to matrices with infinitely many rows and columns, see below.
Definiteness
Matrix A; definiteness; associated quadratic form QA(x,y); set of vectors (x,y) such that QA(x,y)=1
positive definite
indefinite
1/4 x2 + y2
1/4 x2 − 1/4 y2
Ellipse
Hyperbola
A symmetric n×n-matrix is called positive-definite (respectively negative-definite; indefinite), if for all nonzero vectors x ∈ Rn the associated quadratic form given by
Q(x) = xTAx
takes only positive values (respectively only negative values; both some negative and some positive values).[25] If the quadratic form takes only non-negative (respectively only non-positive) values, the symmetric matrix is called positive-semidefinite (respectively negative-semidefinite); hence the matrix is indefinite precisely when it is neither positive-semidefinite nor negative-semidefinite.
A symmetric matrix is positive-definite if and only if all its eigenvalues are positive.[26] The table at the right shows two possibilities for 2-by-2 matrices.
Allowing as input two different vectors instead yields the bilinear form associated to A:
BA (x, y) = xTAy.[27]
Computational aspects
In addition to theoretical knowledge of properties of matrices and their relation to other fields, it is important for practical purposes to perform matrix calculations effectively and precisely. The domain studying these matters is called numerical linear algebra.[28] As with other numerical situations, two main aspects are the complexity of algorithms and their numerical stability. Many problems can be solved by both direct algorithms or iterative approaches. For example, finding eigenvectors can be done by finding a sequence of vectors xn converging to an eigenvector when n tends to infinity.[29]
Determining the complexity of an algorithm means finding upper bounds or estimates of how many elementary operations such as additions and multiplications of scalars are necessary to perform some algorithm, e.g. multiplication of matrices. For example, calculating the matrix product of two n-by-n matrix using the definition given above needs n3 multiplications, since for any of the n2 entries of the product, n multiplications are necessary. The Strassen algorithm outperforms this "naive" algorithm; it needs only n2.807 multiplications.[30] A refined approach also incorporates specific features of the computing devices.
In many practical situations additional information about the matrices involved is known. An important case are sparse matrices, i.e. matrices most of whose entries are zero. There are specifically adapted algorithms for, say, solving linear systems Ax = b for sparse matrices A, such as the conjugate gradient method.[31]
An algorithm is, roughly speaking, numerically stable, if little deviations (such as rounding errors) do not lead to big deviations in the result. For example, calculating the inverse of a matrix via Laplace's formula (Adj (A) denotes the adjugate matrix of A)
A−1 = Adj(A) / det(A)
may lead to significant rounding errors if the determinant of the matrix is very small. The norm of a matrix can be used to capture the conditioning of linear algebraic problems, such as computing a matrix' inverse.[32]
Although most computer languages are not designed with commands or libraries for matrices, as early as the 1970s, some engineering desktop computers such as the HP 9830 had ROM cartridges to add BASIC commands for matrices. Some computer languages such as APL were designed to manipulate matrices, and various mathematical programs can be used to aid computing with matrices.[33]
Matrix decomposition methods
There are several methods to render matrices into a more easily accessible form. They are generally referred to as matrix transformation or matrix decomposition techniques. The interest of all these decomposition techniques is that they preserve certain properties of the matrices in question, such as determinant, rank or inverse, so that these quantities can be calculated after applying the transformation, or that certain matrix operations are algorithmically easier to carry out for some types of matrices.
The LU decomposition factors matrices as a product of lower (L) and an upper triangular matrices (U).[34] Once this decomposition is calculated, linear systems can be solved more efficiently, by a simple technique called forward and back substitution. Likewise, inverses of triangular matrices are algorithmically easier to calculate. The Gaussian elimination is a similar algorithm; it transforms any matrix to row echelon form.[35] Both methods proceed by multiplying the matrix by suitable elementary matrices, which correspond to permuting rows or columns and adding multiples of one row to another row. Singular value decomposition expresses any matrix A as a product UDV∗, where U and V are unitary matrices and D is a diagonal matrix.
A matrix in Jordan normal form. The grey blocks are called Jordan blocks.
The eigendecomposition or diagonalization expresses A as a product VDV−1, where D is a diagonal matrix and V is a suitable invertible matrix.[36] If A can be written in this form, it is called diagonalizable. More generally, and applicable to all matrices, the Jordan decomposition transforms a matrix into Jordan normal form, that is to say matrices whose only nonzero entries are the eigenvalues λ1 to λn of A, placed on the main diagonal and possibly entries equal to one directly above the main diagonal, as shown at the right.[37] Given the eigendecomposition, the nth power of A (i.e. n-fold iterated matrix multiplication) can be calculated via
An = (VDV−1)n = VDV−1VDV−1...VDV−1 = VDnV−1
and the power of a diagonal matrix can be calculated by taking the corresponding powers of the diagonal entries, which is much easier than doing the exponentiation for A instead. This can be used to compute the matrix exponential eA, a need frequently arising in solving linear differential equations, matrix logarithms and square roots of matrices.[38] To avoid numerically ill-conditioned situations, further algorithms such as the Schur decomposition can be employed.[39]
Abstract algebraic aspects and generalizations
Matrices can be generalized in different ways. Abstract algebra uses matrices with entries in more general fields or even rings, while linear algebra codifies properties of matrices in the notion of linear maps. It is possible to consider matrices with infinitely many columns and rows. Another extension are tensors, which can be seen as higher-dimensional arrays of numbers, as opposed to vectors, which can often be realised as sequences of numbers, while matrices are rectangular or two-dimensional array of numbers.[40] Matrices, subject to certain requirements tend to form groups known as matrix groups.
Matrices with more general entries
This article focuses on matrices whose entries are real or complex numbers. However, matrices can be considered with much more general types of entries than real or complex numbers. As a first step of generalization, any field, i.e. a set where addition, subtraction, multiplication and division operations are defined and well-behaved, may be used instead of R or C, for example rational numbers or finite fields. For example, coding theory makes use of matrices over finite fields. Wherever eigenvalues are considered, as these are roots of a polynomial they may exist only in a larger field than that of the coefficients of the matrix; for instance they may be complex in case of a matrix with real entries. The possibility to reinterpret the entries of a matrix as elements of a larger field (e.g., to view a real matrix as a complex matrix whose entries happen to be all real) then allows considering each square matrix to possess a full set of eigenvalues. Alternatively one can consider only matrices with entries in an algebraically closed field, such as C, from the outset.
More generally, abstract algebra makes great use of matrices with entries in a ring R.[41] Rings are a more general notion than fields in that no division operation exists. The very same addition and multiplication operations of matrices extend to this setting, too. The set M(n, R) of all square n-by-n matrices over R is a ring called matrix ring, isomorphic to the endomorphism ring of the left R-module Rn.[42] If the ring R is commutative, i.e., its multiplication is commutative, then M(n, R) is a unitary noncommutative (unless n = 1) associative algebra over R. The determinant of square matrices over a commutative ring R can still be defined using the Leibniz formula; such a matrix is invertible if and only if its determinant is invertible in R, generalising the situation over a field F, where every nonzero element is invertible.[43] Matrices over superrings are called supermatrices.[44]
Matrices do not always have all their entries in the same ring - or even in any ring at all. One special but common case is block matrices, which may be considered as matrices whose entries themselves are matrices. The entries need not be quadratic matrices, and thus need not be members of any ordinary ring; but their sizes must fulfil certain compatibility conditions.
Relationship to linear maps
Linear maps Rn → Rm are equivalent to m-by-n matrices, as described above. More generally, any linear map f: V → W between finite-dimensional vector spaces can be described by a matrix A = (aij), after choosing bases v1, ..., vn of V, and w1, ..., wm of W (so n is the dimension of V and m is the dimension of W), which is such that
In other words, column j of A expresses the image of vj in terms of the basis vectors wi of W; thus this relation uniquely determines the entries of the matrix A. Note that the matrix depends on the choice of the bases: different choices of bases give rise to different, but equivalent matrices.[45] Many of the above concrete notions can be reinterpreted in this light, for example, the transpose matrix AT describes the transpose of the linear map given by A, with respect to the dual bases.[46]
Matrix groups
A group is a mathematical structure consisting of a set of objects together with a binary operation, i.e. an operation combining any two objects to a third, subject to certain requirements.[47] A group in which the objects are matrices and the group operation is matrix multiplication is called a matrix group.[nb 2][48] Since in a group every element has to be invertible, the most general matrix groups are the groups of all invertible matrices of a given size, called the general linear groups.
Any property of matrices that is preserved under matrix products and inverses can be used to define further matrix groups. For example, matrices with a given size and with a determinant of 1 form a subgroup of (i.e. a smaller group contained in) their general linear group, called a special linear group.[49] Orthogonal matrices, determined by the condition
MTM = I,
form the orthogonal group.[50] They are called orthogonal since the associated linear transformations of Rn preserve angles in the sense that the scalar product of two vectors is unchanged after applying M to them:
(Mv) · (Mw) = v · w.[51]
Every finite group is isomorphic to a matrix group, as one can see by considering the regular representation of the symmetric group.[52] General groups can be studied using matrix groups, which are comparatively well-understood, by means of representation theory.[53]
Infinite matrices
It is also possible to consider matrices with infinitely many rows and/or columns[54] even if, being infinite objects, one cannot write down such matrices explicitly. All that matters is that for every element in the set indexing rows, and every element in the set indexing columns, there is a well-defined entry (these index sets need not even be subsets of the natural numbers). The basic operations of addition, subtraction, scalar multiplication and transposition can still be defined without problem; however matrix multiplication may involve infinite summations to define the resulting entries, and these are not defined in general.
If infinite matrices are used to describe linear maps, then only those matrices can be used all of whose columns have but a finite number of nonzero entries, for the following reason. For a matrix A to describe a linear map f: V→W, bases for both spaces must have been chosen; recall that by definition this means that every vector in the space can be written uniquely as a (finite) linear combination of basis vectors, so that written as a (column) vector v of coefficients, only finitely many entries vi are nonzero. Now the columns of A describe the images by f of individual basis vectors of V in the basis of W, which is only meaningful if these columns have only finitely many nonzero entries. There is no restriction on the rows of A however: in the product A·v there are only finitely many nonzero coefficients of v involved, so every one of its entries, even if it is given as an infinite sum of products, involves only finitely many nonzero terms and is therefore well defined. Moreover this amounts to forming a linear combination of the columns of A that effectively involves only finitely many of them, whence the result has only finitely many nonzero entries, because each of those columns do. One also sees that products of two matrices of the given type is well defined (provided as usual that the column-index and row-index sets match), is again of the same type, and corresponds to the composition of linear maps.
Infinite matrices can also be used to describe operators on Hilbert spaces, where convergence and continuity questions arise, which again results in certain constraints that have to be imposed. However, the explicit point of view of matrices tends to obfuscate the matter,[nb 3] and the abstract and more powerful tools of functional analysis can be used instead.
Empty matrices
An empty matrix is a matrix in which the number of rows or columns (or both) is zero.[55][56] An empty matrix has no entries but it still has a well defined number of rows and columns, which are needed for instance in the definition of the matrix product. Thus if A is the 3-by-0 matrix A and B is the 0-by-3 matrix B, then AB is the 3-by-3 zero matrix (corresponding to the null map from a 3-dimensional space V to itself obtained obtained as composition of the unique map f from V to a 0-dimensional space Z, followed by the zero map g from Z back to V), while BA is the 0-by-0 matrix (corresponding to the unique map from Z to itself obtained as composition ). There is no common notation for empty matrices but most computer algebra systems will allow creating them and computing with them. Note that the determinant of the 0-by-0 matrix is 1 (and not 0 as might seem more natural): the Leibniz formula produces this value as a sum over the unique permutation of the empty set, with an empty product as term; also the Laplace expansion for a 1-by-1 matrix makes clear that the value of the 0-by-0 minor should be taken to be 1. This value is also consistent with the fact that the identity map from any finite dimensional space to itself has determinant 1, a fact that is often used as a part of the characterization of determinants.
Multiplication (symbol "×") is the mathematical operation of scaling one number by another. It is one of the four basic operations in elementary arithmetic (the others being addition, subtraction and division).
Because the result of scaling by whole numbers can be thought of as consisting of some number of copies of the original, whole-number products greater than 1 can be computed by repeated addition; for example, 3 multiplied by 4 (often said as "3 times 4") can be calculated by adding 4 copies of 3 together:
Here 3 and 4 are the "factors" and 12 is the "product".
There are differences amongst educationalists which number should normally be considered as the number of copies or whether multiplication should even be introduced as repeated addition.[1]
Multiplication of rational numbers (fractions) and real numbers is defined by systematic generalization of this basic idea.
Multiplication can also be visualized as counting objects arranged in a rectangle (for whole numbers) or as finding the area of a rectangle whose sides have given lengths (for numbers generally). The area of a rectangle does not depend on which side is measured first which illustrates that the order in which numbers are multiplied together does not matter.
In general the result of multiplying two measurements gives a result of a new type depending on the measurements. For instance:
The inverse of multiplication is division: as 3 times 4 is equal to 12, so 12 divided by 3 is equal to 4.
Multiplication is generalized further to other types of numbers (such as complex numbers) and to more abstract constructs such as matrices. For these more abstract constructs, the order in which the operands are multiplied sometimes does matter.
Notation and terminology
The multiplication sign.
Multiplication is often written using the multiplication sign "×" between the terms; that is, in infix notation. The result is expressed with an equals sign. For example,
(verbally, "two times three equals six")
There are several other common notations for multiplication:
Multiplication is sometimes denoted by either a middle dot or a period:
The middle dot is standard in the United States, the United Kingdom, and other countries where the period is used as a decimal point. In other countries that use a comma as a decimal point, either the period or a middle dot is used for multiplication.
The asterisk (as in 5*2) is often used in programming languages because it appears on every keyboard and is easier to see on older monitors[citation needed]. This usage originated in the FORTRAN programming language.
In algebra, multiplication involving variables is often written as a juxtaposition (e.g. xy for x times y or 5x for five times x). This notation can also be used for quantities that are surrounded by parentheses (e.g. 5(2) or (5)(2) for five times two).
In matrix multiplication, there is actually a distinction between the cross and the dot symbols. The cross symbol generally denotes a vector multiplication, while the dot denotes a scalar multiplication. A like convention distinguishes between the cross product and the dot product of two vectors.
The numbers to be multiplied are generally called the "factors" or "multiplicands". When thinking of multiplication as repeated addition, the number to be multiplied is called the "multiplicand", while the number of multiples is called the "multiplier". In algebra, a number that is the multiplier of a variable or expression (e.g. the 3 in 3xy2) is called a coefficient.
The result of a multiplication is called a product, and is a multiple of each factor that is an integer. For example 15 is the product of 3 and 5, and is both a multiple of 3 and a multiple of 5.
Computation
The common methods for multiplying numbers using pencil and paper require a multiplication table of memorized or consulted products of small numbers (typically any two numbers from 0 to 9), however one method, the peasant multiplication algorithm, does not.
Multiplying numbers to more than a couple of decimal places by hand is tedious and error prone. Common logarithms were invented to simplify such calculations. The slide rule allowed numbers to be quickly multiplied to about three places of accuracy. Beginning in the early twentieth century, mechanical calculators, such as the Marchant, automated multiplication of up to 10 digit numbers. Modern electronic computers and calculators have greatly reduced the need for multiplication by hand.
Properties
For the natural numbers, integers, fractions, and real and complex numbers, multiplication has certain properties:
Commutative property
The order in which two numbers are multiplied does not matter:
.
Associative property
Expressions solely involving multiplication are invariant with respect to order of operations:
Distributive property
Holds with respect to addition over multiplication. This identity is of prime importance in simplifying algebraic expressions:
Identity element
The multiplicative identity is 1; anything multiplied by one is itself. This is known as the identity property:
Zero element
Anything multiplied by zero is zero. This is known as the zero property of multiplication:
Zero is sometimes not included amongst the natural numbers.
There are a number of further properties of multiplication not satisfied by all types of numbers.
Negation
Negative one times any number is equal to the opposite of that number.
Negative one times negative one is positive one.
The natural numbers do not include negative numbers.
Inverse element
Every number x, except zero, has a multiplicative inverse, , such that .
The natural numbers and integers do not have inverses.
Order preservation
Multiplication by a positive number preserves order: if a > 0, then if b > c then ab > ac. Multiplication by a negative number reverses order: if a < 0 and b > c then ab < ac.
The complex numbers do not have an order predicate.
Other mathematical systems that include a multiplication operation may not have all these properties. For example, multiplication is not, in general, commutative for matrices and quaternions.
Axioms
In the book Arithmetices principia, nova methodo exposita, Giuseppe Peano proposed axioms for arithmetic based on his axioms for natural numbers.[3] Peano arithmetic has two axioms for multiplication:
Here S(y) represents the successor of y, or the natural number which followsy. The various properties like associativity can be proved from these and the other axioms of Peano arithmetic including induction. For instance S(0). denoted by 1, is a multiplicative identity because
The axioms for integers typically define them as equivalence classes of ordered pairs of natural numbers. The model is based on treating (x,y) as equivalent to x−y when x and y are treated as integers. Thus both (0,1) and (1,2) are equivalent to −1. The multiplication axiom for integers defined this way is
The rule that −1×−1 = 1 can then be deduced from
Multiplication is extended in a similar way to rational numbers and then to real numbers.
Multiplication with set theory
It is possible, though difficult, to create a recursive definition of multiplication with set theory. Such a system usually relies on the Peano definition of multiplication.
Cartesian product
The definition of multiplication as repeated addition provides a way to arrive at a set-theoretic interpretation of multiplication of cardinal numbers. In the expression
if the n copies of a are to be combined in disjoint union then clearly they must be made disjoint; an obvious way to do this is to use either a or n as the indexing set for the other. Then, the members of are exactly those of the Cartesian product . The properties of the multiplicative operation as applying to natural numbers then follow trivially from the corresponding properties of the Cartesian product.
Multiplication in group theory
There are many sets that, under the operation of multiplication, satisfy the axioms that define group structure. These axioms are closure, associativity, and the inclusion of an identity element and inverses.
A simple example is the set of non-zero rational numbers. Here we have identity 1, as opposed to groups under addition where the identity is typically 0. Note that with the rationals, we must exclude zero because, under multiplication, it does not have an inverse: there is no rational number that can be multiplied by zero to result in 1. In this example we have an abelian group, but that is not always the case.
To see this, look at the set of invertible square matrices of a given dimension, over a given field. Now it is straightforward to verify closure, associativity, and inclusion of identity (the identity matrix) and inverses. However, matrix multiplication is not commutative, therefore this group is nonabelian.
Another fact of note is that the integers under multiplication is not a group, even if we exclude zero. This is easily seen by the nonexistence of an inverse for all elements other than 1 and -1.
Multiplication in group theory is typically notated either by a dot, or by juxtaposition (the omission of an operation symbol between elements). So multiplying element a by element b could be notated ab or ab. When referring to a group via the indication of the set and operation, the dot is used, e.g. our first example could be indicated by
Multiplication of different kinds of numbers
Numbers can count (3 apples), order (the 3rd apple), or measure (3.5 feet high); as the history of mathematics has progressed from counting on our fingers to modelling quantum mechanics, multiplication has been generalized to more complicated and abstract types of numbers, and to things that are not numbers (such as matrices) or do not look much like numbers (such as quaternions).
Integers
is the sum of M copies of N when N and M are positive whole numbers. This gives the number of things in an array N wide and M high. Generalization to negative numbers can be done by and . The same sign rules apply to rational and real numbers.
Rational numbers
Generalization to fractions is by multiplying the numerators and denominators respectively: . This gives the area of a rectangle high and wide, and is the same as the number of things in an array when the rational numbers happen to be whole numbers.
Real numbers
(x)(y) is the limit of the products of the corresponding terms in certain sequences of rationals that converge to x and y, respectively, and is significant in calculus. This gives the area of a rectangle x high and y wide. See Products of sequences, above.
Complex numbers
Considering complex numbers z1 and z2 as ordered pairs of real numbers (a1,b1) and (a2,b2), the product is . This is the same as for reals, , when the imaginary partsb1 and b2 are zero.
Further generalizations
See Multiplication in group theory, above, and Multiplicative Group, which for example includes matrix multiplication. A very general, and abstract, concept of multiplication is as the "multiplicatively denoted" (second) binary operation in a ring. An example of a ring which is not any of the above number systems is a polynomial ring (you can add and multiply polynomials, but polynomials are not numbers in any usual sense.)
Division
Often division, , is the same as multiplication by an inverse, . Multiplication for some types of "numbers" may have corresponding division, without inverses; in an integral domain x may have no inverse "" but may be defined. In a division ring there are inverses but they are not commutative (since is not the same as , may be ambiguous).
Exponentiation
When multiplication is repeated, the resulting operation is known as exponentiation. For instance, the product 2×2×2 of three factors of two is "two raised to the third power", and is denoted by 23, a two with a superscript three. In this example, the number two is the base, and three is the exponent. In general, the exponent (or superscript) indicates how many times to multiply base by itself, so that the expression
indicates that the base a to be multiplied by itself n times.
In mathematics, a set is said to be closed under some operation if performance of that operation on members of the set always produces a member of the set. For example, the real numbers are closed under subtraction, but the natural numbers are not: 3 and 7 are both natural numbers, but the result of 3 − 7 is not.
Similarly, a set is said to be closed under a collection of operations if it is closed under each of the operations individually.
A set that is closed under an operation or collection of operations is said to satisfy a closure property. Often a closure property is introduced as an axiom, which is then usually called the axiom of closure. Note that modern set-theoretic definitions usually define operations as maps between sets, so adding closure to a structure as an axiom is superfluous, though it still makes sense to ask whether subsets are closed. For example, the set of real numbers is closed under subtraction, where (as mentioned above) its subset of natural numbers is not.
When a set S is not closed under some operations, one can usually find the smallest set containing S that is closed. This smallest closed set is called the closure of S (with respect to these operations). For example, the closure under subtraction of the set of natural numbers, viewed as a subset of the real numbers, is the set of integers. An important example is that of topological closure. The notion of closure is generalized by Galois connection, and further by monads.
Note that the set S must be a subset of a closed set in order for the closure operator to be defined. In the preceding example, it is important that the reals are closed under subtraction; in the domain of the natural numbers subtraction is not always defined.
The two uses of the word "closure" should not be confused. The former usage refers to the property of being closed, and the latter refers to the smallest closed set containing one that isn't closed. In short, the closure of a set satisfies a closure property.
Closed sets
A set is closed under an operation if that operation returns a member of the set when evaluated on members of the set. Sometimes the requirement that the operation be valued in a set is explicitly stated, in which case it is known as the axiom of closure. For example, one may define a group as a set with a binary product operator obeying several axioms, including an axiom that the product of any two elements of the group is again an element. However the modern definition of an operation makes this axiom superfluous; an n-ary operator on S is just a subset of Sn+1. By its very definition, an operator on a set cannot have values outside the set.
Nevertheless, the closure property of an operator on a set still has some utility. Closure on a set does not necessarily imply closure on all subsets. Thus a subgroup of a group is a subset on which the binary product and the unary operation of inversion satisfy the closure axiom.
An operation of a different sort is that of finding the limit points of a subset of a topological space (if the space is first-countable, it suffices to restrict consideration to the limits of sequences but in general one must consider at least limits of nets). A set that is closed under this operation is usually just referred to as a closed set in the context of topology. Without any further qualification, the phrase usually means closed in this sense. Closed intervals like [1,2] = {x: 1 ≤ x ≤ 2} are closed in this sense.
A partially ordered set is downward closed (and also called a lower set) if for every element of the set all smaller elements are also in it; this applies for example for the real intervals (-∞, p) and (-∞, p], and for an ordinal number p represented as interval [ 0, p); every downward closed set of ordinal numbers is itself an ordinal number.
Upward closed and upper set are defined similarly.
P closures of binary relations
The notion of a closure can be generalized for an arbitrary binary relation R ⊆ S×S, and an arbitrary property P in the following way: the P closure of R is the least relation Q ⊆ S×S that contains R (i.e. R ⊆ Q) and for which property P holds (i.e. P(Q) is true). For instance, one can define the symmetric closure as the least symmetric relation containing R. This generalization is often encountered in the theory of rewriting systems, where one often uses more "wordy" notions such as the reflexive transitive closureR*—the smallest preorder containing R, or the reflexive transitive symmetric closureR≡—the smallest equivalence relation containing R. For arbitrary P and R, the P closure of R need not exist. In the above examples, these exist because reflexivity, transitivity and symmetry are closed under arbitrary intersections. In such cases, the P closure can be directly defined as the intersection of all sets with property P containing R.[1]
Closure operator
Given an operation on a set X, one can define the closure C(S) of a subset S in X to be the smallest subset closed under that operation that contains S as a subset. For example, the closure of a subset of a group is the subgroup generated by that set.
The closure of sets with respect to some operation defines a closure operator on the subsets of X. The closed sets can be determined from the closure operator; a set is closed if it is equal to its own closure. Typical structural properties of all closure operations are:
The closure is increasing or extensive: the closure of an object contains the object.
The closure is idempotent: the closure of the closure equals the closure.
The closure is monotone, that is, if X is contained in Y, then also C(X) is contained in C(Y).
An object that is its own closure is called closed. By idempotency, an object is closed if and only if it is the closure of some object.
These three properties define an abstract closure operator. Typically, an abstract closure acts on the class of all subsets of a set.
Examples
In topology and related branches, the relevant operation is taking limits. The topological closure of a set is the corresponding closure operator. The Kuratowski closure axioms characterize this operator.
In linear algebra, the linear span of a set X of vectors is the closure of that set; it is the smallest subset of the vector space that includes X and is closed under the operation of linear combination. This subset is a subspace.
In matroid theory, the closure of X is the largest superset of X that has the same rank as X.
In set theory, the transitive closure of a binary relation.
In algebra, the algebraic closure of a field.
In commutative algebra, closure operations for ideals, as integral closure and tight closure.
In geometry, the convex hull of a set S of points is the smallest convex set of which S is a subset.
In the theory of formal languages, the Kleene closure of a language can be described as the set of strings that can be made by concatenating zero or more strings from that language.
In group theory, the normal closure of a set of group elements is the smallest normal subgroup containing the set.
A vinculum is a horizontal line placed over a mathematical expression, used to indicate that it is to be considered grouped together. Vinculum is Latin for "bond", "fetter", "chain", or "tie", which is roughly suggestive of some of the uses of the symbol.
Uses
In a repeating decimal, the vinculum is used to indicate the group of repeating digits:
It is used as part of the notation of a radical to indicate the radicand whose root is being indicated. In the next case, the quantity ab + 2 is the radicand, and thus has a vinculum over it:
It is used to show the repeating terms in a periodic continued fraction. Quadratic irrational numbers are the only numbers that have these.
It can be used in signed-digit representation to represent negative digits, such as the following example in balanced ternary:
The vinculum is sometimes used in Boolean algebra, where it serves to indicate a group of expressions whose logical result is to be negated, as in:
It can even be used as a notation to indicate a group (bracket smaller to parenthesis):
meaning to add b and c first and then subtract the result from a.
In particle physics, the vinculum is used to indicate antiparticles. For example, p and p are the symbols for proton and antiproton, respectively.
The vinculum should not be confused with a similar-looking vector notation, e.g. "vector from A to B", or "vector named a".
In mathematics, a quotient is the result of a division. For example, when dividing 6 by 3, the quotient is 2, while 6 is called the dividend, and 3 the divisor. The quotient can also be expressed as the number of times the divisor divides into the dividend.
A quotient can also mean just the integer part of the result of dividing two integers. For example, the quotient of 13 ÷ 5 would be 2 while the remainder would be 3. For more, see the division algorithm.
In more abstract branches of mathematics, the word quotient is often used to describe sets, spaces, or algebraic structures whose elements are the equivalence classes of some equivalence relation on another set, space, or algebraic structure. See:
quotient set
quotient group
quotient ring
quotient space (linear algebra)
quotient space of a topological space
quotient object
right quotient and left quotient (operations on formal languages)
The quotient rule is a method for finding derivatives in calculus.
Quotients also come up in certain tests, like the IQ test, which stands for intelligence quotient. In this case, one's quotient is basically one's score[clarification needed]. In recent decades, as more emphasis has been placed on full personal development, other similar quotients have appeared. These include moral quotient, emotional quotient, adversity quotient, social quotient, creativity quotient, etc.
In mathematics, a divisor of an integer n, also called a factor of n, is an integer which evenly divides n without leaving a remainder.
Explanation
The name "divisor" comes from the arithmetic operation of division: if a / b = c then a is the dividend, b the divisor, and c the quotient.
In general, m | n (read as "m divides n") for non-zero integers m and n iff there exists an integer k such that n = km. Thus, divisors can be negative as well as positive, although sometimes the term is restricted to positive divisors. (For example, there are six divisors of four, 1, 2, 4, −1, −2, −4, but only the positive ones would usually be mentioned, i.e. 1, 2, and 4.)
1 and −1 divide (are divisors of) every integer, every integer (and its negation) is a divisor of itself, and every integer is a divisor of 0, except by convention 0 itself (see also division by zero). Numbers divisible by 2 are called even and numbers not divisible by 2 are called odd.
1, −1, n and −n are known as the trivial divisors of n. A divisor of n that is not a trivial divisor is known as a non-trivial divisor. A number with at least one non-trivial divisor is known as a composite number, while the units -1 and 1 and prime numbers have no non-trivial divisors.
There are divisibility rules which allow one to recognize certain divisors of a number from the number's digits.
Examples
7 is a divisor of 42 because 42 / 7 = 6, so we can say 7 | 42. It can also be said that 42 is divisible by 7, 42 is a multiple of 7, 7 divides 42, or 7 is a factor of 42.
The non-trivial divisors of 6 are 2, −2, 3, −3.
The positive divisors of 42 are 1, 2, 3, 6, 7, 14, 21, 42.
The set of all positive divisors of 60, A = {1,2,3,4,5,6,10,12,15,20,30,60}, partially ordered by divisibility, has the Hasse diagram:
Further notions and facts
There are some elementary rules:
If a | b and b | c, then a | c. This is the transitive relation.
If a | b and b | a, then a = b or a = − b.
If a | b and c | b, then it is NOT always true that (a + c) | b (eg 2|6 and 3|6 but 5 does not divide 6). However, when a | b and a | c, then a | (b + c) is true, as is a | (b − c).[1]
The vertical bar used is a Unicode "Divides" character, code point U+2223. Its negated symbol is ∤. In an ASCII-only environment, the standard vertical bar "|", which is slightly shorter, is often used.
If a | bc, and gcd(a,b) = 1, then a | c. This is called Euclid's lemma.
If p is a prime number and p | ab then p | a or p | b (or both).
A positive divisor of n which is different from n is called a proper divisor or an aliquot part of n. A number that does not evenly divide n but leaves a remainder is called an aliquant part of n.
An integer n > 1 whose only proper divisor is 1 is called a prime number. Equivalently, a prime number is a positive integer which has exactly two positive factors: 1 and itself.
Any positive divisor of n is a product of prime divisors of n raised to some power. This is a consequence of the fundamental theorem of arithmetic.
If a number equals the sum of its proper divisors, it is said to be a perfect number. Numbers less than the sum of their proper divisors are said to be abundant, while numbers greater than that sum are said to be deficient.
The total number of positive divisors of n is a multiplicative function d(n) (e.g. ). The sum of the positive divisors of n is another multiplicative function σ(n) (e.g. ). Both of these functions are examples of divisor functions.
If the prime factorization of n is given by
then the number of positive divisors of n is
and each of the divisors has the form
where for each
It can be shown that for any natural n the inequality holds.
Also it can be shown [2] that
One interpretation of this result is that a randomly chosen positive integer n has an expected number of divisors of about lnn.
Divisibility of numbers
The relation of divisibility turns the set N of non-negative integers into a partially ordered set, in fact into a complete distributive lattice. The largest element of this lattice is 0 and the smallest is 1. The meet operation ^ is given by the greatest common divisor and the join operation v by the least common multiple. This lattice is isomorphic to the dual of the lattice of subgroups of the infinite cyclic group Z.
In mathematics, especially in elementary arithmetic, division (÷) is the arithmetic operation that is the inverse of multiplication.
Specifically, if c times b equals a, written:
where b is not zero, then a divided by b equals c, written:
For instance,
since
.
In the above expression, a is called the dividend, b the divisor and c the quotient.
Conceptually, division describes two distinct but related settings. Partitioning involves taking a set of size a and forming b groups that are equal in size. The size of each group formed, c, is the quotient of a and b. Quotative division involves taking a set of size a and forming groups of size b. The number of groups of this size that can be formed, c, is the quotient of a and b.[1]
Teaching division usually leads to the concept of fractions being introduced to students. Unlike addition, subtraction, and multiplication, the set of all integers is not closed under division. Dividing two integers may result in a remainder. To complete the division of the remainder, the number system is extended to include fractions or rational numbers as they are more generally called.
Notation
Division is often shown in algebra and science by placing the dividend over the divisor with a horizontal line, also called a vinculum or fraction bar, between them. For example, a divided by b is written
This can be read out loud as "a divided by b", "a by b" or "a over b". A way to express division all on one line is to write the dividend, or numerator then a slash, then the divisor, or denominator like this:
This is the usual way to specify division in most computer programming languages since it can easily be typed as a simple sequence of ASCII characters.
A typographical variation, which is halfway between these two forms, uses a solidus (fraction slash) but elevates the dividend, and lowers the divisor:
a⁄b
Any of these forms can be used to display a fraction. A fraction is a division expression where both dividend and divisor are integers (although typically called the numerator and denominator), and there is no implication that the division needs to be evaluated further.
A second way to show division is to use the obelus (or division sign), common in arithmetic, in this manner:
This form is infrequent except in elementary arithmetic. The obelus is also used alone to represent the division operation itself, as for instance as a label on a key of a calculator.
In some non-English-speaking cultures, "a divided by b" is written a : b. However, in English usage the colon is restricted to expressing the related concept of ratios (then "a is to b").
Computing division
Division is often introduced through the notion of "sharing out" a set of objects, for example a pile of sweets, into a number of equal portions. Distributing the objects several at a time in each round of sharing to each portion leads to the idea of "chunking", i.e. division by repeated subtraction.
More systematic and more efficient (but also more formalised and more rule-based, and more removed from an overall holistic picture of what division is achieving), a person who knows the multiplication tables can divide two integers using pencil and paper and the method of short division, if the divisor is simple, or long division for larger integer divisors. If the dividend has a fractional part (expressed as a decimal fraction), we can continue the algorithm past the ones place as far as desired. If the divisor has a fractional part, we can restate the problem by moving the decimal to the right in both numbers until the divisor has no fraction.
Modern computers compute division by methods that are faster than long division: see Division (digital).
A person can calculate division with an abacus by repeatedly placing the dividend on the abacus, and then subtracting the divisor the offset of each digit in the result, counting the number of divisions possible at each offset.
A person can use logarithm tables to divide two numbers, by subtracting the two numbers' logarithms, then looking up the antilogarithm of the result.
A person can calculate division with a slide rule by aligning the divisor on the C scale with the dividend on the D scale. The quotient can be found on the D scale where it is aligned with the left index on the C scale. The user is responsible, however, for mentally keeping track of the decimal point.
In modular arithmetic, some numbers have a multiplicative inverse with respect to the modulus. We can calculate division by multiplication in such a case. This approach is useful in computers that do not have a fast division instruction.
Division algorithm
The division algorithm is mathematical theorem that precisely expresses the outcome of the usual process of division of integers. In particular, the theorem asserts that integers called the quotient q and remainder r always exist and that they are uniquely determined by the dividend a and divisor d, with d ≠ 0. Formally, the theorem is stated as follows: There exist unique integers q and r such that a = qd + r and 0 ≤ r < | d |, where | d | denotes the absolute value of d.
Division of integers
Division of integers is not closed. Apart from division by zero being undefined, the quotient will not be an integer unless the dividend is an integer multiple of the divisor; for example 26 cannot be divided by 10 to give an integer. In such a case there are four possible approaches.
Say that 26 cannot be divided by 10; division becomes a partial function.
Give the answer as a decimal fraction or a mixed number, so or This is the approach usually taken in mathematics.
Give the answer as an integer quotient and a remainder, so
Give the integer quotient as the answer, so This is sometimes called integer division.
One has to be careful when performing division of integers in a computer program. Some programming languages, such as C, will treat division of integers as in case 4 above, so the answer will be an integer. Other languages, such as MATLAB, will first convert the integers to real numbers, and then give a real number as the answer, as in case 2 above.
Names and symbols used for integer division include div, /, \, and %. Definitions vary regarding integer division when the quotient is negative: rounding may be toward zero or toward −∞.
Divisibility rules can sometimes be used to quickly determine whether one integer divides exactly into another.
Division of rational numbers
The result of dividing two rational numbers is another rational number when the divisor is not 0. We may define division of two rational numbers p/q and r/s by
All four quantities are integers, and only p may be 0. This definition ensures that division is the inverse operation of multiplication.
Division of real numbers
Division of two real numbers results in another real number when the divisor is not 0. It is defined such a/b = c if and only if a = cb and b ≠ 0.
Division by zero
Division of any number by zero (where the divisor is zero) is not defined. This is because zero multiplied by any finite number will always result in a product of zero. Entry of such an equation into most calculators will result in an error message being issued.
Division of complex numbers
Dividing two complex numbers results in another complex number when the divisor is not 0, defined thus:
All four quantities are real numbers. r and s may not both be 0.
Division for complex numbers expressed in polar form is simpler than the definition above:
Again all four quantities are real numbers. r may not be 0.
Division of polynomials
One can define the division operation for polynomials. Then, as in the case of integers, one has a remainder. See polynomial long division or synthetic division.
Division of matrices
One can define a division operation for matrices. The usual way to do this is to define A / B = AB−1, where B−1 denotes the inverse of B, but it is far more common to write out AB−1 explicitly to avoid confusion.
Left and right division
Because matrix multiplication is not commutative, one can also define a left division or so-called backslash-division as A \ B = A−1B. For this to be well defined, B−1 need not exist, however A−1 does need to exist. To avoid confusion, division as defined by A / B = AB−1 is sometimes called right division or slash-division in this context.
Note that with left and right division defined this way, A/(BC) is in general not the same as (A/B)/C and nor is (AB)\C the same as A\(B\C), but A/(BC) = (A/C)/B and (AB)\C = B\(A\C).
Matrix division and pseudoinverse
To avoid problems when A−1 and/or B−1 do not exist, division can also be defined as multiplication with the pseudoinverse, i.e., A / B = AB+ and A \ B = A+B, where A+ and B+ denote the pseudoinverse of A and B.
Division in abstract algebra
In abstract algebras such as matrix algebras and quaternion algebras, fractions such as are typically defined as or where b is presumed to be an invertible element (i.e. there exists a multiplicative inverse b− 1 such that bb− 1 = b− 1b = 1 where 1 is the multiplicative identity). In an integral domain where such elements may not exist, division can still be performed on equations of the form ab = ac or ba = ca by left or right cancellation, respectively. More generally "division" in the sense of "cancellation" can be done in any ring with the aforementioned cancellation properties. If such a ring is finite, then by an application of the pigeonhole principle, every nonzero element of the ring is invertible, so division by any nonzero element is possible in such a ring. To learn about when algebras (in the technical sense) have a division operation, refer to the page on division algebras. In particular Bott periodicity can be used to show that any real normed division algebra must be isomorphic to either the real numbers R, the complex numbers C, the quaternions H, or the octonions O.
Division and calculus
The derivative of the quotient of two functions is given by the quotient rule:
There is no general method to integrate the quotient of two functions.
In geometry, an angle is the figure formed by two rays sharing a common endpoint, called the vertex of the angle.[1] The magnitude of the angle is the "amount of rotation" that separates the two rays, and can be measured by considering the length of circular arc swept out when one ray is rotated about the vertex to coincide with the other (see "Measuring angles", below). Where there is no possibility of confusion, the term "angle" is used interchangeably for both the geometric configuration itself and for its angular magnitude (which is simply a numerical quantity).
The word angle comes from the Latin word angulus, meaning "a corner". The word angulus is a diminutive, of which the primitive form, angus, does not occur in Latin. Cognate words are the Greek ἀγκύλος(ankylοs), meaning "crooked, curved," and the English word "ankle." Both are connected with the Proto-Indo-European root *ank-, meaning "to bend" or "bow".[2]
Euclid defines a plane angle as the inclination to each other, in a plane, of two lines which meet each other, and do not lie straight with respect to each other. According to Proclus an angle must be either a quality or a quantity, or a relationship. The first concept was used by Eudemus, who regarded an angle as a deviation from a straight line; the second by Carpus of Antioch, who regarded it as the interval or space between the intersecting lines; Euclid adopted the third concept, although his definitions of right, acute, and obtuse angles are certainly quantitative.
Measuring angles
Two angles are sometimes called congruent if there exists an isometry that transforms one of the angles into the other angle. The size of an angle is normally characterized by the smallest positive rotation that maps one of the rays into the other. Two angles are congruent if and only if they correspond to the same (smallest positive) rotation. Thus an angle as two rays is characterized by an angle of rotation. To avoid confusion when no isometry exists between particular representations of angles, angles that Euclid called "equal" are described as "equal in measure".
In many geometrical situations, angles that differ by an exact multiple of a full circle are effectively equivalent (it makes no difference how many times a line is rotated through a full circle because it always ends up in the same place). However, this is not always the case. For example, when tracing a curve such as a spiral using polar coordinates, an extra full turn gives rise to a quite different point on the curve.
The angle θ is the quotient of s and r.
In order to measure an angle θ, a circular arc centered at the vertex of the angle is drawn, e.g. with a pair of compasses. The length of the arc s is then divided by the radius of the circle r, and possibly multiplied by a scaling constant k (which depends on the units of measurement that are chosen):
The value of θ thus defined is independent of the size of the circle: if the length of the radius is changed then the arc length changes in the same proportion, so the ratio s/r is unaltered.
Units
In dimensional analysis, angles are considered to be dimensionless. There are several units used to measure angles, depending on the choice of the constant k in the formula above. Of these units, treated in more detail below, the degree and the radian are by far the most common.
With the notable exception of the radian, most units of angular measurement are defined such that one full circle (i.e. one revolution) is equal to n units, for some whole number n. For example, in the case of degrees, n = 360. A full circle of n units is obtained by setting k = n/(2π) in the formula above. (Proof. The formula above can be rewritten as k = θr/s. One full circle, for which θ = n units, corresponds to an arc equal in length to the circle's circumference, which is 2πr, so s = 2πr. Substituting n for θ and 2πr for s in the formula, results in k = nr/(2πr) = n/(2π).)
The degree, denoted by a small superscript circle (°), is 1/360 of a full circle, so one full circle is 360°. One advantage of this old sexagesimal subunit is that many angles common in simple geometry are measured as a whole number of degrees. Fractions of a degree may be written in normal decimal notation (e.g. 3.5° for three and a half degrees), but the following sexagesimal subunits of the "degree-minute-second" system are also in use, especially for geographical coordinates and in astronomy and ballistics:
The minute of arc (or MOA, arcminute, or just minute) is 1/60 of a degree. It is denoted by a single prime ( ′ ). For example, 3° 30′ is equal to 3 + 30/60 degrees, or 3.5 degrees. A mixed format with decimal fractions is also sometimes used, e.g. 3° 5.72′ = 3 + 5.72/60 degrees. A nautical mile was historically defined as a minute of arc along a great circle of the Earth.
The second of arc (or arcsecond, or just second) is 1/60 of a minute of arc and 1/3600 of a degree. It is denoted by a double prime ( ″ ). For example, 3° 7′ 30″ is equal to 3 + 7/60 + 30/3600 degrees, or 3.125 degrees.
θ = s/r rad = 1 rad.
The angle of the equilateral triangle is 1/6 of a full circle. It was the unit used by the Babylonians[citation needed], and is especially easy to construct with ruler and compasses. The degree, minute of arc and second of arc are sexagesimal subunits of the Babylonian unit. 1 Babylonian unit = 60° = π/3 rad ≈ 1.047197551 rad.
The radian is the angle subtended by an arc of a circle that has the same length as the circle's radius (k = 1 in the formula given earlier). One full circle is 2π radians, and one radian is 180/π degrees, or about 57.2958 degrees. The radian is abbreviated rad, though this symbol is often omitted in mathematical texts, where radians are assumed unless specified otherwise. When radians are used angles are considered as dimensionless. The radian is used in virtually all mathematical work beyond simple practical geometry, due, for example, to the pleasing and "natural" properties that the trigonometric functions display when their arguments are in radians. The radian is the (derived) unit of angular measurement in the SI system.
The mil is approximately equal to a milliradian. There are several definitions.
The turn (or full circle, revolution, rotation, or cycle) is one full circle. A turn is subdivided in centiturns and milliturns. The revolution and rotation are abbreviated rev and rot, respectively, but just r in rpm (revolutions per minute). 1 turn = 360° = 2π rad = 400 gon = 4 right angles.
The point, used in navigation, is 1/32 of a full circle. It is a binary subunit of the turn. Naming all 32 points on a compass rose is called "boxing the compass". 1 point = 1/8 of a right angle = 11.25° = 12.5 gon. Each point is subdivided in four quarter-points so that 1 turn equals 128 quarter-points.
The binary degree, also known as the binary radian (or brad), is 1/256 of a full circle.[4] The binary degree is used in computing so that an angle can be efficiently represented in a single byte (albeit to limited precision). Other measures of angle used in computing may be based on dividing one whole turn into 2n equal parts for other values of n.[5]
The quadrant is 1/4 of a full circle, i.e. a right angle. It is the unit used in Euclid's Elements. 1 quad. = 90° = π/2 rad = 1/4 turn= 100 gon.
The grad, also called grade, gradian, or gon is 1/400 of a turn, so one full circle is 400 grads and a right angle is 100 grads. It is a decimal subunit of the quadrant. A kilometre was historically defined as a centi-gon of arc along a great circle of the Earth, so the kilometre is the decimal analog to the sexagesimal nautical mile. The gon is used mostly in triangulation.
The astronomical hour angle is 1/24 of a full circle. Since this system is amenable to measuring objects that cycle once per day (such as the relative position of stars), the sexagesimal subunits are called minute of time and second of time. Note that these are distinct from, and 15 times larger than, minutes and seconds of arc. 1 hour = 15° = π/12 rad = 1/6 quad. = 1/24 turn ≈ 16.667 gon.
Positive and negative angles
In mathematics the angle from the first to the second coordinate axis of a coordinate system is considered as positive. Therefore angles given a sign are positive angles if measured anticlockwise, and negative angles if measured clockwise, from a given line. If no line is specified, it can be assumed to be the first coordinate axis (x-axis) in the Cartesian plane. In many geometrical situations a negative angle of −θ is effectively equivalent to a positive angle of "one full rotation less θ". For example, a clockwise rotation of 45° (that is, an angle of −45°) is often effectively equivalent to an anticlockwise rotation of 360° − 45° (that is, an angle of 315°).
In three dimensional geometry, "clockwise" and "anticlockwise" have no absolute meaning, so the direction of positive and negative angles must be defined relative to some reference, which is typically a vector passing through the angle's vertex and perpendicular to the plane in which the rays of the angle lie.
In navigation, bearings are measured from north, increasing clockwise, so a bearing of 45 degrees is north-east. Negative bearings are not used in navigation, so north-west is 315 degrees.
Alternative ways of measuring the size of an angle
There are several alternatives to measuring the size of an angle by the corresponding angle of rotation. The grade of a slope, or gradient is equal to the tangent of the angle, or sometimes the sine. Gradients are often expressed as a percentage. For very small values (less than 5%), the grade of a slope is approximately the measure of an angle in radians.
In rational geometry the spread between two lines is defined at the square of sine of the angle between the lines. Since the sine of an angle and the sine of its supplementary angle are the same any angle of rotation that maps one of the lines into the other leads to the same value of the spread between the lines.
Approximations
1° is approximately the width of a little finger at arm's length.
10° is approximately the width of a closed fist at arm's length.
20° is approximately the width of a handspan at arm's length.
These measurements clearly depend on the individual subject, and the above should be treated as rough approximations only.
Identifying angles
In mathematical expressions, it is common to use Greek letters (α, β, γ, θ, φ, ...) to serve as variables standing for the size of some angle. (To avoid confusion with its other meaning, the symbol π is typically not used for this purpose.) Lower case roman letters (a, b, c, ...) are also used. See the figures in this article for examples.
In geometric figures, angles may also be identified by the labels attached to the three points that define them. For example, the angle at vertex A enclosed by the rays AB and AC (i.e. the lines from point A to point B and point A to point C) is denoted ∠BAC or BÂC. Sometimes, where there is no risk of confusion, the angle may be referred to simply by its vertex ("angle A").
Potentially, an angle denoted, say, ∠BAC might refer to any of four angles: the clockwise angle from B to C, the anticlockwise angle from B to C, the clockwise angle from C to B, or the anticlockwise angle from C to B, where the direction in which the angle is measured determines its sign (see Positive and negative angles). However, in many geometrical situations it is obvious from context that the positive angle less than or equal to 180° degrees is meant, and no ambiguity arises. Otherwise, a convention may be adopted so that ∠BAC always refers to the anticlockwise (positive) angle from B to C, and ∠CAB to the anticlockwise (positive) angle from C to B.
Types of angles
Right angle.
Reflex angle.
The complementary angles a and b (b is the complement of a, and a is the complement of b).
Acute (a), obtuse (b), and straight (c) angles. Here, a and b are supplementary angles.
An angle of 90° (π/2 radians, or 1/4 turn) is called a right angle.
Two lines that form a right angle are said to be perpendicular or orthogonal.
Angles equal to 1/2 turn (180° or two right angles) are called straight angles.
Angles that are not right angles or a multiple of a right angle are called oblique angles.
Angles smaller than a right angle (less than 90°) are called acute angles ("acute" meaning "sharp").
Angles larger than a right angle and smaller than a straight angle (between 90° and 180°) are called obtuse angles ("obtuse" meaning "blunt").
Angles larger than a straight angle but less than 1 turn (between 180° and 360°) are called reflex angles.
Angles that have the same measure (i.e. the same magnitude) are sometimes said to be congruent. Following this definition for congruent angles, an angle is defined by its measure and is not dependent upon the lengths of the sides of the angle (e.g. all right angles are congruent).
Two angles opposite each other, formed by two intersecting straight lines that form an "X"-like shape, are called vertical angles or opposite angles or vertically opposite angles. These angles are equal in measure.
Angles that share a common vertex and edge but do not share any interior points are called adjacent angles.
Two angles that sum to one right angle (90°) are called complementary angles.
The difference between an angle and a right angle is termed the complement of the angle.
Two angles that sum to a straight angle (180°) are called supplementary angles.
The difference between an angle and a straight angle (180°) is termed the supplement of the angle.
Two angles that sum to one turn (360°) are called explementary angles or conjugate angles.
An angle that is part of a simple polygon is called an interior angle if it lies on the inside of that simple polygon. A concave simple polygon has at least one interior angle that exceeds 180°.
In Euclidean geometry, the measures of the interior angles of a triangle add up to π radians, or 180°, or 1/2 turn; the measures of the interior angles of a simple quadrilateral add up to 2π radians, or 360°, or 1 turn. In general, the measures of the interior angles of a simple polygon with n sides add up to [(n − 2) × π] radians, or [(n − 2) × 180]°, or (2n − 4) right angles, or (n/2 − 1) turn.
The angle supplementary to the interior angle is called the exterior angle. It measures the amount of "turn" one has to make at this vertex to trace out the polygon. If the corresponding interior angle is a reflex angle, the exterior angle should be considered negative. Even in a non-simple polygon it may be possible to define the exterior angle, but one will have to pick an orientation of the plane (or surface) to decide the sign of the exterior angle measure.
In Euclidean geometry, the sum of the exterior angles of a simple polygon will be one full turn (360°).
Some authors use the name exterior angle of a simple polygon to simply mean the explementary (not supplementary!) of the interior angle.[6] This conflicts with the above usage.
The angle between two planes (such as two adjacent faces of a polyhedron) is called a dihedral angle. It may be defined as the acute angle between two lines normal to the planes.
The angle between a plane and an intersecting straight line is equal to ninety degrees minus the angle between the intersecting line and the line that goes through the point of intersection and is normal to the plane.
If a straight transversal line intersects two parallel lines, corresponding (as well as alternate) angles at the two points of intersection are equal in size; adjacent angles are supplementary (that is, their measures add to π radians, or 180°).
A reference angle is the acute version of any angle determined by repeatedly subtracting or adding 180 degrees, and subracting the result from 180 degrees if necessary, until a value between 0 degrees and 90 degrees is obtained. For example, an angle of 30 degrees has a reference angle of 30 degrees, and an angle of 150 degrees also has a reference angle of 30 degrees (180-150). An angle of 750 degrees has a reference angle of 30 degrees (750-720).[7]
A formal definition
Using trigonometric functions
A Euclidean angle is completely determined by the corresponding right triangle. In particular, if θ is a Euclidean angle, it is true that
and
for two numbers x and y. So an angle in the Euclidean plane can be legitimately given by two numbers x and y.
To the ratio y/x there correspond two angles in the geometric range 0 < θ < 2π, since
Using rotations
Suppose we have two unit vectors and in the euclidean plane . Then there exists one positive isometry (a rotation), and one only, from to that maps u onto v. Let r be such a rotation. Then the relation defined by is an equivalence relation and we call angle of the rotation r the equivalence class , where denotes the unit circle of . The angle between two vectors will simply be the angle of the rotation that maps one onto the other. We have no numerical way of determining an angle yet. To do this, we choose the vector (1,0), then for any point M on at distance θ from (1,0) (on the circle), let . If we call rθ the rotation that transforms (1,0) into , then is a bijection, which means we can identify any angle with a number between 0 and .
Angles between curves
The angle between the two curves at P is defined as the angle between the tangents A and B at P
The angle between a line and a curve (mixed angle) or between two intersecting curves (curvilinear angle) is defined to be the angle between the tangents at the point of intersection. Various names (now rarely, if ever, used) have been given to particular cases:—amphicyrtic (Gr. ἀμφί, on both sides, κυρτός, convex) or cissoidal (Gr. κισσός, ivy), biconvex; xystroidal or sistroidal (Gr. ξυστρίς, a tool for scraping), concavo-convex; amphicoelic (Gr. κοίλη, a hollow) or angulus lunularis, biconcave.[8]
Dot product and generalisation
In the Euclidean plane, the angle θ between two vectorsu and v is related to their dot product and their lengths by the formula
This formula supplies an easy method to find the angle between two planes (or curved surfaces) from their normal vectors and between skew lines from their vector equations.
Inner product
To define angles in an abstract real inner product space, we replace the Euclidean dot product ( · ) by the inner product , i.e.
In a complex inner product space, the expression for the cosine above may give non-real values, so it is replaced with
or, more commonly, using the absolute value, with
The latter definition ignores the direction of the vectors and thus describes the angle between one-dimensional subspaces and spanned by the vectors and correspondingly.
Angles between subspaces
The definition of the angle between one-dimensional subspaces and given by
in a Hilbert space can be extended to subspaces of any finite dimensions. Given two subspaces with , this leads to a definition of k angles called canonical or principal angles between subspaces.
Angles in Riemannian geometry
In Riemannian geometry, the metric tensor is used to define the angle between two tangents. Where U and V are tangent vectors and gij are the components of the metric tensor G,
Angles in geography and astronomy
In geography, the location of any point on the Earth can be identified using a geographic coordinate system. This system specifies the latitude and longitude of any location in terms of angles subtended at the centre of the Earth, using the equator and (usually) the Greenwich meridian as references.
In astronomy, a given point on the celestial sphere (that is, the apparent position of an astronomical object) can be identified using any of several astronomical coordinate systems, where the references vary according to the particular system. Astronomers measure the angular separation of two stars by imagining two lines through the centre of the Earth, each intersecting one of the stars. The angle between those lines can be measured, and is the angular separation between the two stars.
Astronomers also measure the apparent size of objects as an angular diameter. For example, the full moon has an angular diameter of approximately 0.5°, when viewed from Earth. One could say, "The Moon subtends an angle of half a degree." The small-angle formula can be used to convert such an angular measurement into a distance/size ratio.
Parallelism is a term in geometry and in everyday life that refers to a property in Euclidean space of two or more lines or planes, or a combination of these. The existence and properties of parallel lines are the basis of Euclid's parallel postulate. Two lines in a plane that do not intersect or meet are called parallel lines.
Symbol
The parallel symbol is . For example, indicates that line AB is parallel to line CD.
In the Unicode character set, the 'parallel' and 'not parallel' signs have codepoints U+2225 (∥) and U+2226 (∦) respectively. They are part of the Mathematical Operators range.
Euclidean parallelism
As shown by the tick marks, lines a and b are parallel. This can be proved because the transversal t produces congruent angles.
Given straight lines l and m, the following descriptions of line m equivalently define it as parallel to line l in Euclidean space:
Every point on line m is located exactly the same minimum distance from line l (equidistant lines).
Line m is on the same plane as line l but does not intersect l (even assuming that lines extend to infinity in either direction).
Lines m and l are both intersected by a third straight line (a transversal) in the same plane, and the corresponding angles of intersection with the transversal are equal. (This is equivalent to Euclid's parallel postulate.)
In other words, parallel lines must be located in the same plane, and parallel planes must be located in the same three-dimensional space. A parallel combination of a line and a plane may be located in the same three-dimensional space. Lines parallel to each other have the same gradient. Compare to perpendicular.
Construction
The three definitions above lead to three different methods of construction of parallel lines.
The problem: Draw a line through a parallel to l.
Definition 1: Line m has everywhere the same distance to line l.
Definition 2: Take a random line through a that intersects l in x. Move point x to infinity.
Definition 3: Both l and m share a transversal line through a that intersect them at 90°.
Another definition of parallel line that's often used is that two lines are parallel if they do not intersect, though this definition applies only in the 2-dimensional plane. Another easy way is to remember that a parallel line is a line that has an equal distance with the opposite line.
Distance between two parallel lines
Because a parallel line is a line that has an equal distance with the opposite line, there is a unique distance between the two parallel lines. Given the equations of two non-vertical parallel lines:
the distance between the two lines can be formulated by the following formula:
Also if 2 lines are
the distance between the two lines can be formulated by the following formula:
Extension to non-Euclidean geometry
In non-Euclidean geometry it is more common to talk about geodesics than (straight) lines. A geodesic is the path that a particle follows if no force is applied to it. In non-Euclidean geometry (spherical or hyperbolic) the above three definitions are not equivalent: only the second one is useful in other geometries. In general, equidistant lines are not geodesics so the equidistant definition cannot be used. In the Euclidean plane, when two geodesics (straight lines) are intersected with the same angles by a transversal geodesic (see image), every (non-parallel) geodesic intersects them with the same angles. In both the hyperbolic and spherical plane, this is not the case. For example, geodesics sharing a common perpendicular only do so at one point (hyperbolic space) or at two (antipodal) points (spherical space).
In general geometry it is useful to distinguish the three definitions above as three different types of lines, respectively equidistant lines, parallel geodesics and geodesics sharing a common perpendicular.
While in Euclidean geometry two geodesics can either intersect or be parallel, in general and in hyperbolic space in particular there are three possibilities. Two geodesics can be either:
intersecting: they intersect in a common point in the plane
parallel: they do not intersect in the plane, but do in the limit to infinity
ultra parallel: they do not even intersect in the limit to infinity
In the literature ultra parallel geodesics are often called parallel. Geodesics intersecting at infinity are then called limit geodesics.
Spherical
On the spherical plane there is no such thing as a parallel line. Line a is a great circle, the equivalent of a straight line in the spherical plane. Line c is equidistant to line a but is not a great circle. It is a parallel of latitude. Line b is another geodesic which intersects a in two antipodal points. They share two common perpendiculars (one shown in blue).
In the spherical plane, all geodesics are great circles. Great circles divide the sphere in two equal hemispheres and all great circles intersect each other. By the above definitions, there are no parallel geodesics to a given geodesic, all geodesics intersect. Equidistant lines on the sphere are called parallels of latitude in analog to latitude lines on a globe. Parallel lines in Euclidean space are straight lines; equidistant lines are not geodesics and therefore are not directly analogous to straight lines in the Euclidean space. An object traveling along such a line has to accelerate away from the geodesic to which it is equidistant to avoid intersecting with it. When embedded in Euclidean space a dimension higher, parallels of latitude can be generated by the intersection of the sphere with a plane parallel to a plane through the center.
Hyperbolic
In the hyperbolic plane, there are two lines through a given point that intersect a given line in the limit to infinity. While in Euclidean geometry a geodesic intersects its parallels in both directions in the limit to infinity, in hyperbolic geometry both directions have their own line of parallelism. When visualized on a plane a geodesic is said to have a left-handed parallel and a right-handed parallel through a given point. The angle the parallel lines make with the perpendicular from that point to the given line is called the angle of parallelism. The angle of parallelism depends on the distance of the point to the line with respect to the curvature of the space. The angle is also present in the Euclidean case, there it is always 90° so the left and right-handed parallels coincide. The parallel lines divide the set of geodesics through the point in two sets: intersecting geodesics that intersect the given line in the hyperbolic plane, and ultra parallel geodesics that do not intersect even in the limit to infinity (in either direction). In the Euclidean limit the latter set is empty.
Intersecting, parallel and ultra parallel lines through a with respect to l in the hyperbolic plane. The parallel lines appear to intersect l just off the image. This is an artifact of the visualisation. It is not possible to isometrically embed the hyperbolic plane in three dimension. In a real hyperbolic space the line will get closer to each other and 'touch' in infinity.
In Euclidean geometry, a line is a straight curve. When geometry is used to model the real world, lines are used to represent straight objects with negligible width and height. Lines are an idealisation of such objects and have no width or height at all and are usually considered to be infinitely long. Lines are a fundamental concept in some approaches to geometry such as Euclid's, but in others such as analytic geometry and Tarski's axioms they enter as derived notions defined in terms of more fundamental primitives such as points.
A line segment is a part of a line that is bounded by two distinct end points and contains every point on the line between its end points. Depending on how the line segment is defined, either of the two end points may or may not be part of the line segment. Two or more line segments may have some of the same relationships as lines, such as being parallel, intersecting, or skew.
Euclidean geometry
When geometry was first formalised by Euclid in Elements, he defined lines to be "breadthless length" with a straight line being a line "which lies evenly with the points on itself".[1] These definitions serve little purpose since they use terms which are not, themselves, defined. In fact, Euclid did not use these definitions in work and probably included them just to make it clear to the reader what was being discussed. In modern geometry, a line is simply taken as an undefined object with properties given by postulates.[2]
In an axiomatic formulation of Euclidean geometry, such as that of Hilbert (Euclid's original axioms contained various flaws which have been corrected by modern mathematicians),[3] a line is stated to have certain properties which relate it to other lines and points. For example, for any two distinct points, there is a unique line containing them, and any two distinct lines intersect at most one point.[4] In two dimensions, ie. the Euclidean plane, two lines which do not intersect are called parallel. In higher dimensions, two lines that do not intersect may be parallel if they are contained in a plane, or skew if they are not.
Any collection of finitely many lines partitions the plane into convex polygons (possibly unbounded); this partition is known as an arrangement of lines.
Ray
If concept of "order" of points of a line is defined, a ray, or half-line, may be defined as well. A ray is part of a line which is finite in one direction, but infinite in the other. It can be defined by two points, the initial point, A, and one other, B. The ray is all the points in the line segment between A and B together with all points, C, on the line through A and B such that the points appear on the line in the order A, B, C.[5]
In topology, a ray in a space X is a continuous embedding R+ → X. It is used to define the important concept of end of the space.
Coordinate geometry
In coordinate geometry, lines in a Cartesian plane can be described algebraically by linear equations and linear functions. In two dimensions, the characteristic equation is often given by the slope-intercept form:
where:
m is the slope of the line.
c is the y-intercept of the line.
x is the independent variable of the function y.
The gradient (slope, m) of the line through points A(ax, ay) and B(bx, by) is given by m = (by-ay)/(bx-ax) and the Cartesian equation of this line can be written: y - ay = m(x - ax)
In three dimensions, a line is described by parametric equations:
where:
x, y, and z are all functions of the independent variable t.
x0, y0, and z0 are the initial values of each respective variable (or (x0, y0, z0) is any point on the line).
a, b, and c are related to the slope of the line, such that the vector (a, b, c) is a parallel to the line.
In R2, every line L is described by a linear equation of the form
with fixed real coefficients a, b and c such that a and b are not both zero (see Linear equation for other forms). Important properties of these lines are their slope, x-intercept and y-intercept. The eccentricity of a straight line is infinity.
Vector equation
The vector equation of the line through points A and B is given by r = OA + λAB(where λ is a scalar multiple)
If a is vector OA and b is vector OB, then the equation of the line can be written: r = a + λ(b - a)
Collinear points
Points A (ax,ay) , B (bx,by) and C (cx,cy) are collinear (in the same straight line) if vector AB is parallel to vector BC, or, equivalently, if the gradients of lines AB and AC are equal.
i.e. A, B & C are collinear if and only if: (cy-by)/(cx-bx) = (by-ay)/(bx-ax)
Euclidean space
In Euclidean space Rn (and analogously in all other vector spaces), we define a line L as a subset of the form
where a and b are given vectors in Rn with b non-zero. The vector b describes the direction of the line, and a is a point on the line. Different choices of a and b can yield the same line.
Projective geometry
In projective geometry, a line is similar to that in Euclidean geometry but has slightly different properties. In many models of projective geometry, the idea of the line rarely conforms to the notion of the "straight curve" as it is visualised in Euclidean geometry. The Poincaré half-plane model is a typical example of this.
Geodesics
The "straightness" of a line, interpreted as the property that it minimizes distances between its points, can be generalized and leads to the concept of geodesics on differentiable manifolds.
In geometry, a locus (Latin for "place", plural loci) is a collection of points which share a property. For example a circle may be defined as the locus of points in a plane at a fixed distance from a given point.
A locus may alternatively be described as the path through which a point moves to fulfill a given condition or conditions. So, for example, a circle may also be defined as the locus of a point moving so as to remain at a given distance from a fixed point.
The term occurs in complex dynamics as:
Bifurcation locus
Connectedness locus
Proofs involving loci
In general, a proof that a locus is a particular curve has two parts. The first part is to show that every point on the curve satisfies the condition of the locus and second part is to show that every point that satisfies the condition is on the curve. For example, to show that the locus of points equidistant between two given points is their perpendicular bisector, one must show:
Any point equidistant from the two points lies on the perpendicular bisector,
Any point on this line is equidistant from the two given points.
In geometry, topology and related branches of mathematics a spatial point describes a specific object within a given space that consists of neither volume, area, length, nor any other higher dimensional analogue. Thus, a point is a 0-dimensional object. Because of their nature as one of the simplest geometric concepts, they are often used in one form or another as the fundamental constituents of geometry, physics, vector graphics, and many other fields.
Points in Euclidean geometry
A finite set of points (blue) in two dimensional Euclidean space.
Points are most often considered within the framework of Euclidean geometry, where they are one of the fundamental objects. Euclid originally defined the point vaguely, as "that which has no part". In two dimensional Euclidean space, a point is represented by an ordered pair, , of numbers, where the first number conventionally represents the horizontal and is often denoted by , and the second number conventionally represents the vertical and is often denoted by . This idea is easily generalized to three dimensional Euclidean space, where a point is represented by an ordered triplet, , with the additional third number representing depth and often denoted by z. Further generalizations are represented by an ordered tuplet of n terms, where n is the dimension of the space in which the point is located.
Many constructs within Euclidean geometry consist of an infinite collection of points that conform to certain axioms. This is usually represented by a set of points; As an example, a line is an infinite set of points of the form , where through and are constants and n is the dimension of the space. Similar constructions exist that define the plane, line segment and other related concepts.
In addition to defining points and constructs related to points, Euclid also postulated a key idea about points; he claimed that any two points can be connected by a straight line. This is easily confirmed under modern expansions of Euclidean geometry, and had lasting consequences at its introduction, allowing the construction of almost all the geometric concepts of the time. However, Euclid's postulation of points was neither complete nor definitive, as he occasionally assumed facts about points that didn't follow directly from his axioms, such as the ordering of points on the line or the existence of specific points. In spite of this, modern expansions of the system serve to remove these assumptions.
Points in branches of mathematics
A point in point-set topology is defined as a member of the underlying set of a topological space.
Although the notion of a point is generally considered fundamental in mainstream geometry and topology, there are some systems that forego it, e.g. noncommutative geometry and pointless topology. A “pointless space” is defined not as a set, but via some structure (algebraic or logical respectively) which looks like a well-known function space on the set: an algebra of continuous functions or an algebra of sets respectively. More precisely, such structures generalize well-known spaces of functions in a way that the operation “take a value at this point” may not be defined.
A paradox is a statement or group of statements that leads to a contradiction or a situation which defies intuition. The term is also used for an apparent contradiction that actually expresses a non-dual truth (cf. kōan, Catuskoti). Typically, the statements in question do not really imply the contradiction, the puzzling result is not really a contradiction, or the premises themselves are not all really true or cannot all be true together. The word paradox is often used interchangeably with contradiction. It is also used to describe situations that are ironic.[1]
But many paradoxes, such as Curry's paradox, do not yet have universally accepted resolutions.
Sometimes the term paradox is used for situations that are merely surprising. The birthday paradox, for instance, is unexpected but perfectly logical. The logician Willard V. O. Quine distinguishes falsidical paradoxes, which are seemingly valid, logical demonstrations of absurdities, from veridical paradoxes, such as the birthday paradox, which are seeming absurdities that are nevertheless true.[2] Paradoxes in economics tend to be the veridical type, typically counterintuitive outcomes of economic theory, such as Simpson's paradox. In literature a paradox can be any contradictory or obviously untrue statement, which resolves itself upon later inspection.
Logical paradox
Common themes in paradoxes include self-reference, the infinite regress, circular definitions, and confusion between different levels of abstraction.
Patrick Hughes outlines three laws of the paradox:[3]
Self reference - An example is "This statement is false", a form of the Liar paradox. The statement is referring to itself. Another example of self reference is the question of whether the barber shaves himself in the Barber paradox. One more example would be "Is the answer to this question no?" In this case, if you replied no, you would be stating that the answer is not no. If you reply yes, you are stating that it is no, because you said yes.
Contradiction - "This statement is false"—the statement cannot be false and true at the same time.
Vicious circularity or infinite regress - "This statement is false"—if the statement is true, then the statement is false, thereby making the statement true. Another example of vicious circularity is the following group of statements:
"The statement below is false."
"The statement above is true."
Other paradoxes involve false statements or half-truths and the resulting biased assumptions.
For example, consider a situation in which a father and his son are driving down the road. The car collides with a tree and the father is killed. The boy is rushed to the nearest hospital where he is prepared for emergency surgery. On entering the surgery suite, the surgeon says, "I can't operate on this boy. He's my son."
The apparent paradox is caused by a hasty generalization; if the surgeon is the boy's father, the statement cannot be true. The paradox is resolved if it is revealed that the surgeon is a woman, the boy's mother.
Paradoxes which are not based on a hidden error generally happen at the fringes of context or language, and require extending the context or language to lose their paradoxical quality. Paradoxes that arise from apparently intelligible uses of language are often of interest to logicians and philosophers. This sentence is false is an example of the famous liar paradox: it is a sentence which cannot be consistently interpreted as true or false, because if it is known to be false then it is known that it must be true, and if it is known to be true then it is known that it must be false. Therefore, it can be concluded that it is unknowable. Russell's paradox, which shows that the notion of the set of all those sets that do not contain themselves leads to a contradiction, was instrumental in the development of modern logic and set theory.
Thought experiments can also yield interesting paradoxes. The grandfather paradox, for example, would arise if a time traveler were to kill his own grandfather before his father was conceived, thereby preventing his own birth. This paradox can be resolved by postulating that time travel leads to parallel or bifurcating universes, or that only contradiction-free timelines are stable.
W. V. Quine (1962) distinguished between three classes of paradoxes:
A veridical paradox produces a result that appears absurd but is demonstrated to be true nevertheless. Thus, the paradox of Frederic's birthday in The Pirates of Penzance establishes the surprising fact that a twenty-one-year-old would have had only five birthdays, if he was born on a leap day. Likewise, Arrow's impossibility theorem demonstrates difficulties in mapping voting results to the will of the people.
A falsidical paradox establishes a result that not only appears false but actually is false due to a fallacy in the demonstration. The various invalid mathematical proofs (e.g., that 1 = 2) are classic examples, generally relying on a hidden division by zero. Another example is the inductive form of the Horse paradox, falsely generalizes from true specific statements.
A paradox which is in neither class may be an antinomy, which reaches a self-contradictory result by properly applying accepted ways of reasoning. For example, the Grelling–Nelson paradox points out genuine problems in our understanding of the ideas of truth and description.
A fourth kind has sometimes been described since Quine's work.
A paradox which is both true and false at the same time in the same sense is called a dialetheism. In Western logics it is often assumed, following Aristotle, that no dialetheia exist, but they are sometimes accepted in Eastern traditions[which?] and in paraconsistent logics. An example might be to affirm or deny the statement "John is in the room" when John is standing precisely halfway through the doorway. It is reasonable (by human thinking) to both affirm and deny it ("well, he is, but he isn't"), and it is also reasonable to say that he is neither ("he's halfway in the room, which is neither in nor out"), despite the fact that the statement is to be exclusively proven or disproven.
Paradox in literature
The paradox as a literary device has been defined as an anomalous juxtaposition of incongruous ideas for the sake of striking exposition or unorthodox insight. It functions as a method of literary analysis which involves examining apparently contradictory statements and drawing conclusions either to reconcile them or to explain their presence.[4]
Literary or rhetorical paradoxes abound in the works of Oscar Wilde and G. K. Chesterton; other literature deals with paradox of situation. Rabelais, Cervantes, Sterne, Borges, and Chesterton are all concerned with episodes and narratives designed around paradoxes. Statements such as Wilde's "I can resist anything except temptation" and Chesterton's "spies do not look like spies" are examples of rhetorical paradox. Further back, Polonius' observation in Hamlet that "though this be madness, yet there is method in't" is a memorable third.[4]
A taste for paradox is central to the philosophies of Laozi, Heraclitus, Meister Eckhart, Kierkegaard, Nietzsche, and Tom Robbins, among many others.
Moral paradox
In moral philosophy, paradox in a loose sense plays a role in ethics debates. For instance, it may be considered that an ethical admonition to "love thy neighbour" is not just in contrast with, but in contradiction to armed neighbors actively intending murder. If the hostile neighbors succeed, it is impossible to follow the dictum. On the other hand, to attack, fight back, or restrain them is also not usually considered loving. This might be better termed an ethical dilemma rather than a paradox in the strict sense. However, for this to be a true example of a moral paradox, it must be assumed that "loving" and restraint cannot co-exist. In reality, this situation occurs often, notably when parents punish children out of love[citation needed].
Another example is the conflict between a moral injunction and a duty that cannot be fulfilled without violating that injunction. For example, take the situation of a parent with children who must be fed (the duty), but cannot afford to do so without stealing, which would be wrong (the injunction). Such a conflict between two maxims is normally resolved through weakening one or the other of them: the need for survival is greater than the need to abide by the law. However, as maxims are added for consideration, the questions of which to weaken in the general case and by how much pose issues related to Arrow's impossibility theorem; it may not be possible to formulate a consistent system of ethics rules with a definite order of preference in the general case, a so-called "ethical calculus".
Paradoxes in a more strict sense have been relatively neglected in philosophical discussion within ethics, as compared to their role in other philosophical fields such as logic, epistemology, metaphysics, or even the philosophy of science. Important book-length discussions appear in Derek Parfit's Reasons and Persons and in Saul Smilansky's 10 Moral Paradoxes.
In quantified modal logic, the Barcan formula and the converse Barcan formula (more accurately, schemata rather than formulae) (i) syntactically state principles or interchange between quantifiers and modalities; (ii) semantically state a relation between domains of possible worlds. The formulae were introduced as axioms by Ruth Barcan Marcus, in the first extensions of modal propositional logic to include quantification. [1]
Related formulas include the Buridan formula, and the converse Buridan formula.
The Barcan formula
The Barcan formula is:
.
In English, the schema reads: If everything is necessarily F, then it is necessary that everything is F. It is equivalent to
.
The Barcan formula has generated some controversy because it implies that all objects which exist in every possible world (accessible to the actual world) exist in the actual world, i.e. that domains cannot grow when one moves to accessible worlds. This thesis is sometimes known as actualism--i.e. that there are no merely possible individuals. There is some debate as to the informal interpretation of the Barcan formula and its converse.
Converse Barcan formula
The converse Barcan formula is:
.
If a frame is based on a symmetric accessibility relation, then the Barcan formula will be valid in the frame if, and only if, the converse Barcan formula is valid in the frame. It states that domains cannot shrink as one moves to accessible worlds, i.e. that individuals cannot cease to be possible. The converse Barcan formula is taken to be more plausible than the Barcan formula.
In mathematics a field of sets is a pair where X is a set and is an algebra over X i.e., a non-empty subset of the power set of X closed under the intersection and union of pairs of sets and under complements of individual sets. In other words forms a subalgebra of the power set Boolean algebra of X. (Many authors refer to itself as a field of sets.) Elements of X are called points and those of are called complexes.
Fields of sets play an essential role in the representation theory of Boolean algebras. Every Boolean algebra can be represented as a field of sets.
Fields of sets in the representation theory of Boolean algebras
[edit]Stone representation
Every finite Boolean algebra can be represented as a whole power set - the power set of its set of atoms; each element of the Boolean algebra corresponds to the set of atoms below it (the join of which is the element). This power set representation can be constructed more generally for any complete atomic Boolean algebra.
In the case of Boolean algebras which are not complete and atomic we can still generalize the power set representation by considering fields of sets instead of whole power sets. To do this we first observe that the atoms of a finite Boolean algebra correspond to its ultrafilters and that an atom is below an element of a finite Boolean algebra if and only if that element is contained in the ultrafilter corresponding to the atom. This leads us to construct a representation of a Boolean algebra by taking its set of ultrafilters and forming complexes by associating with each element of the Boolean algebra the set of ultrafilters containing that element. This construction does indeed produce a representation of the Boolean algebra as a field of sets and is known as the Stone representation. It is the basis of Stone's representation theorem for Boolean algebras and an example of a completion procedure in order theory based on ideals or filters, similar to Dedekind cuts.
Alternatively one can consider the set of homomorphisms onto the two element Boolean algebra and form complexes by associating each element of the Boolean algebra with the set of such homomorphisms that map it to the top element. (The approach is equivalent as the ultrafilters of a Boolean algebra are precisely the pre-images of the top elements under these homomorphisms.) With this approach one sees that Stone representation can also be regarded as a generalization of the representation of finite Boolean algebras by truth tables.
[edit]Separative and compact fields of sets: towards Stone duality
A field of sets is called separative if and only if for every pair of distinct points there is a complex containing one and not the other.
A field of sets is called compact if and only if for every proper filter over the intersection of all the complexes contained in the filter is non-empty.
These definitions arise from considering the topology generated by the complexes of a field of sets. Given a field of sets the complexes form a base for a topology, we denote the corresponding topological space by . Then
is always a zero-dimensional space.
is a Hausdorff space if and only if is separative.
is a compact space with compact open sets if and only if is compact.
is a Boolean space with clopen sets if and only if is both separative and compact.
The Stone representation of a Boolean algebra is always separative and compact; the corresponding Boolean space is known as the Stone space of the Boolean algebra. The clopen sets of the Stone space are then precisely the complexes of the Stone representation. The area of mathematics known as Stone duality is founded on the fact that the Stone representation of a Boolean algebra can be recovered purely from the corresponding Stone space whence a duality exists between Boolean algebras and Boolean spaces.
[edit]Fields of sets with additional structure
[edit]Sigma algebras and measure spaces
If an algebra over a set is closed under countable intersections and countable unions, it is called a sigma algebra and the corresponding field of sets is called a measurable space. The complexes of a measurable space are called measurable sets.
A measure space is a triple where is a measurable space and μ is a measure defined on it. If μ is in fact a probability measure we speak of a probability space and call its underlying measurable space a sample space. The points of a sample space are called samples and represent potential outcomes while the measurable sets (complexes) are called events and represent properties of outcomes for which we wish to assign probabilities. (Many use the term sample space simply for the underlying set of a probability space, particularly in the case where every subset is an event.) Measure spaces and probability spaces play a foundational role in measure theory and probability theory respectively.
The Loomis-Sikorski theorem provides a Stone-type duality between abstract sigma algebras and measurable spaces.
[edit]Topological fields of sets
A topological field of sets is a triple where is a topological space and is a field of sets which is closed under the closure operator of or equivalently under the interior operator i.e. the closure and interior of every complex is also a complex. In other words forms a subalgebra of the power set interior algebra on .
Every interior algebra can be represented as a topological field of sets with its interior and closure operators corresponding to those of the topological space.
Given a topological space the clopen sets trivially form a topological field of sets as each clopen set is its own interior and closure. The Stone representation of a Boolean algebra can be regarded as such a topological field of sets.
[edit]Algebraic fields of sets and Stone fields
A topological field of sets is called algebraic if and only if there is a base for its topology consisting of complexes.
If a topological field of sets is both compact and algebraic then its topology is compact and its compact open sets are precisely the open complexes. Moreover the open complexes form a base for the topology.
Topological fields of sets that are separative, compact and algebraic are called Stone fields and provide a generalization of the Stone representation of Boolean algebras. Given an interior algebra we can form the Stone representation of its underlying Boolean algebra and then extend this to a topological field of sets by taking the topology generated by the complexes corresponding to the open elements of the interior algebra (which form a base for a topology). These complexes are then precisely the open complexes and the construction produces a Stone field representing the interior algebra - the Stone representation.
Preorder fields
A preorder field is a triple where is a preordered set and is a field of sets.
Like the topological fields of sets, preorder fields play an important role in the representation theory of interior algebras. Every interior algebra can be represented as a preorder field with its interior and closure operators corresponding to those of the Alexandrov topology induced by the preorder. In other words
there exists a with and
there exists a with for all
Preorder fields arise naturally in modal logic where the points represent the possible worlds in the Kripke semantics of a theory in the modal logic S4 (a formal mathematical abstraction of epistemic logic), the preorder represents the accessibility relation on these possible worlds in this semantics, and the complexes represent sets of possible worlds in which individual sentences in the theory hold, providing a representation of the Lindenbaum-Tarski algebra of the theory.
Algebraic and canonical preorder fields
A preorder field is called algebraic if and only if it has a set of complexes which determines the preorder in the following manner: if and only if for every complex , implies . The preorder fields obtained from S4 theories are always algebraic, the complexes determining the preorder being the sets of possible worlds in which the sentences of the theory closed under necessity hold.
A separative compact algebraic preorder field is said to be canonical. Given an interior algebra, by replacing the topology of its Stone representation with the corresponding canonical preorder (specialization preorder) we obtain a representation of the interior algebra as a canonical preorder field. By replacing the preorder by its corresponding Alexandrov topology we obtain an alternative representation of the interior algebra as a topological field of sets. (The topology of this "Alexandrov representation" is just the Alexandrov bi-coreflection of the topology of the Stone representation.)
Complex algebras and fields of sets on relational structures
The representation of interior algebras by preorder fields can be generalized to a representation theorem for arbitrary (normal) Boolean algebras with operators. For this we consider structures where is a relational structure i.e. a set with an indexed family of relations defined on it, and is a field of sets. The complex algebra (or algebra of complexes) determined by a field of sets on a relational structure, is the Boolean algebra with operators
where for all , if is a relation of arity n + 1, then is an operator of arity n and for all
there exist such that
This construction can be generalized to fields of sets on arbitrary algebraic structures having both operators and relations as operators can be viewed as a special case of relations. If is the whole power set of then is called a full complex algebra or power algebra.
Every (normal) Boolean algebra with operators can be represented as a field of sets on a relational structure in the sense that it is isomorphic to the complex algebra corresponding to the field.
(Historically the term complex was first used in the case where the algebraic structure was a group and has its origins in 19th century group theory where a subset of a group was called a complex.)
The algebra of sets develops and describes the basic properties and laws of sets, the set-theoretic operations of union, intersection, and complementation and the relations of set equality (mathematics) and set inclusion. It also provides systematic procedures for evaluating expressions, and performing calculations, involving these operations and relations.
Introduction
The algebra of sets is the development of the fundamental properties of set operations and set relations. These properties provide insight into the fundamental nature of sets. They also have practical considerations.
Just like expressions and calculations in ordinary arithmetic, expressions and calculations involving sets can be quite complex. It is helpful to have systematic procedures available for manipulating and evaluating such expressions and performing such computations.
In the case of arithmetic, it is elementary algebra that develops the fundamental properties of arithmetic operations and relations.
For example, the operations of addition and multiplication obey familiar laws such as associativity, commutativity and distributivity, while, the "less than or equal" relation satisfies such laws as reflexivity, antisymmetry and transitivity. These laws provide tools which facilitate computation, as well as describe the fundamental nature of numbers, their operations and relations.
The algebra of sets is the set-theoretic analogue of the algebra of numbers. It is the algebra of the set-theoretic operations of union, intersection and complementation, and the relations of equality and inclusion. These are the topics covered in this article. For a basic introduction to sets see the article on sets, for a fuller account see naive set theory, and for a full rigorous axiomatic treatment see axiomatic set theory.
The fundamental laws of set algebra
The binary operations of set union and intersection satisfy many identities. Several of these identities or "laws" have well established names. Three pairs of laws, are stated, without proof, in the following proposition.
PROPOSITION 1: For any sets A, B, and C, the following identities hold:
commutative laws:
associative laws:
distributive laws:
Notice that the analogy between unions and intersections of sets, and addition and multiplication of numbers, is quite striking. Like addition and multiplication, the operations of union and intersection are commutative and associative, and intersection distributes over unions. However, unlike addition and multiplication, union also distributes over intersection.
The next proposition, states two additional pairs of laws involving three specials sets: the empty set, the universal set and the complement of a set.
PROPOSITION 2: For any subset A of universal set U, where Ø is the empty set, the following identities hold:
identity laws:
complement laws:
The identity laws (together with the commutative laws) say that, just like 0 and 1 for addition and multiplication, Ø and U are the identity elements for union and intersection, respectively.
Unlike addition and multiplication, union and intersection do not have inverse elements. However the complement laws give the fundamental properties of the somewhat inverse-like unary operation of set complementation.
The preceding five pairs of laws: the commutative, associative, distributive, identity and complement laws, can be said to encompass all of set algebra, in the sense that every valid proposition in the algebra of sets can be derived from them.
The principle of duality
The above propositions display the following interesting pattern. Each of the identities stated above is one of a pair of identities, such that, each can be transformed into the other by interchanging ∪ and ∩, and also Ø and U.
These are examples of an extremely important and powerful property of set algebra, namely, the principle of duality for sets, which asserts that for any true statement about sets, the dual statement obtained by interchanging unions and intersections, interchanging U and Ø and reversing inclusions is also true. A statement is said to be self-dual if it is equal to its own dual.
Some additional laws for unions and intersections
The following proposition states six more important laws of set algebra, involving unions and intersections.
PROPOSITION 3: For any subsets A and B of a universal set U, the following identities hold:
idempotent laws:
domination laws:
absorption laws:
As noted above each of the laws stated in proposition 3, can be derived from the five fundamental pairs of laws stated in proposition 1 and proposition 2. As an illustration, a proof is given below for the idempotent law for union.
Proof:
by the identity law of intersection
by the complement law for union
by the distributive law of union over intersection
by the complement law for intersection
by the identity law for union
The following proof illustrates that the dual of the above proof is the proof of the dual of the idempotent law for union, namely the idempotent law for intersection.
Proof:
by the identity law for union
by the complement law for intersection
by the distributive law of intersection over union
by the complement law for union
by the identity law for intersection
Some additional laws for complements
The following proposition states five more important laws of set algebra, involving complements.
PROPOSITION 4: Let A and B be subsets of a universe U, then:
De Morgan's laws:
double complement or Involution law:
complement laws for the universal set and the empty set:
Notice that the double complement law is self-dual.
The next proposition, which is also self-dual, says that the complement of a set is the only set that satisfies the complement laws. In other words, complementation is characterized by the complement laws.
PROPOSITION 5: Let A and B be subsets of a universe U, then:
uniqueness of complements:
If , and , then
The algebra of inclusion
The following proposition says that inclusion is a partial order.
PROPOSITION 6: If A, B and C are sets then the following hold:
reflexivity:
antisymmetry:
and if and only if
transitivity:
If and , then
The following proposition says that for any set S, the power set of S, ordered by inclusion, is a bounded lattice, and hence together with the distributive and complement laws above, show that it is a Boolean algebra.
PROPOSITION 7: If A, B and C are subsets of a set S then the following hold:
existence of a least element and a greatest element:
existence of joins:
If and , then
existence of meets:
If and , then
The following proposition says that the statement is equivalent to various other statements involving unions, intersections and complements.
PROPOSITION 8: For any two sets A and B, the following are equivalent:
The above proposition shows that the relation of set inclusion can be characterized by either of the operations of set union or set intersection, which means that the notion of set inclusion is axiomatically superfluous.
The algebra of relative complements
The following proposition lists several identities concerning relative complements or set-theoretic difference.
PROPOSITION 9: For any universe U and subsets A, B, and C of U, the following identities hold:
In abstract algebra, a Boolean algebra or Boolean lattice is a complemented distributive lattice. This type of algebraic structure captures essential properties of both set operations and logic operations. A Boolean algebra can be seen as a generalization of a power set algebra or a field of sets.
Definition
A Boolean algebra is a six-tuple consisting of a set A, equipped with two binary operations (called "meet" or "and"), (called "join" or "or"), a unary operation (called "complement" or "not") and two elements 0 and 1 (sometimes denoted by and ), such that for all elements a, b and c of A, the following axioms hold:
associativity
commutativity
absorption
distributivity
complements
A Boolean algebra with only one element is called a trivial Boolean algebra or a degenerate Boolean algebra. (Some authors require 0 and 1 to be distinct elements in order to exclude this case.)
It follows from the first three pairs of axioms above (associativity, commutativity and absorption) that
if and only if
The relation ≤ defined by a ≤ b if and only if the above equivalent conditions hold, is a partial order with least element 0 and greatest element 1. The meet and the join of two elements coincide with their infimum and supremum, respectively, with respect to ≤.
As in every lattice, the relations and satisfy the first three pairs of axioms above; the fourth pair is just distributivity. Since the complements in a distributive lattice are unique, to define the involution it suffices to define as the complement of a.
The set of axioms is self-dual in the sense that if one exchanges with and 0 with 1 in an axiom, the result is again an axiom. Therefore by applying this operation to a Boolean algebra (or Boolean lattice), one obtains another Boolean algebra with the same elements; it is called its dual.
Examples
The simplest non-trivial Boolean algebra has only two elements, 0 and 1, and is defined by the rules:
0
1
0
0
0
1
0
1
0
1
0
0
1
1
1
1
a
0
1
¬a
1
0
It has applications in logic, interpreting 0 as false, 1 as true, as and, as or, and as not. Expressions involving variables and the Boolean operations represent statement forms, and two such expressions can be shown to be equal using the above axioms if and only if the corresponding statement forms are logically equivalent.
The two-element Boolean algebra is also used for circuit design in electrical engineering; here 0 and 1 represent the two different states of one bit in a digital circuit, typically high and low voltage. Circuits are described by expressions containing variables, and two such expressions are equal for all values of the variables if and only if the corresponding circuits have the same input-output behavior. Furthermore, every possible input-output behavior can be modeled by a suitable Boolean expression.
The two-element Boolean algebra is also important in the general theory of Boolean algebras, because an equation involving several variables is generally true in all Boolean algebras if and only if it is true in the two-element Boolean algebra (which can be checked by a trivial brute force algorithm for small numbers of variables). This can for example be used to show that the following laws (Consensus theorems) are generally valid in all Boolean algebras:
The power set (set of all subsets) of any given nonempty set S forms a Boolean algebra with the two operations (union) and (intersection). The smallest element 0 is the empty set and the largest element 1 is the set S itself.
The set of all subsets of S that are either finite or cofinite is a Boolean algebra.
Starting with the propositional calculus with κ sentence symbols, form the Lindenbaum algebra (that is, the set of sentences in the propositional calculus modulo tautology). This construction yields a Boolean algebra. It is in fact the free Boolean algebra on κ generators. A truth assignment in propositional calculus is then a Boolean algebra homomorphism from this algebra to {0,1}.
Given any linearly ordered set with a least element, the interval algebra is the smallest algebra of subsets of L containing all of the half-open intervals [a, b) such that a is in L and b in either in L or equal to ∞. Interval algebras are useful in the study of Lindenbaum-Tarski algebras; every countable BA is isomorphic to an interval algebra.
For any natural number n, the set of all positive divisors of n forms a distributive lattice if we write a ≤ b for a | b. This lattice is a Boolean algebra if and only if n is square-free. The smallest element 0 of this Boolean algebra is the natural number 1; the largest element 1 of this Boolean algebra is the natural number n.
Other examples of Boolean algebras arise from topological spaces: if X is a topological space, then the collection of all subsets of X which are both open and closed forms a Boolean algebra with the operations (union) and (intersection).
If R is an arbitrary ring and we define the set of central idempotents by A = { e ∈ R : e2 = e, ex = xe, ∀x ∈ R } then the set A becomes a Boolean algebra with the operations and .
Homomorphisms and isomorphisms
A homomorphism between two Boolean algebras A and B is a function f : A → B such that for all a, b in A:
It then follows that for all a in A as well. The class of all Boolean algebras, together with this notion of morphism, forms a full subcategory of the category of lattices.
Boolean rings, ideals and filters
Every Boolean algebra gives rise to a ring (A, +, *) by defining (this operation is called symmetric difference in the case of sets and XOR in the case of logic) and . The zero element of this ring coincides with the 0 of the Boolean algebra; the multiplicative identity element of the ring is the 1 of the Boolean algebra. This ring has the property that a * a = a for all a in A; rings with this property are called Boolean rings.
Conversely, if a Boolean ring A is given, we can turn it into a Boolean algebra by defining and . Since these two constructions are inverses of each other, we can say that every Boolean ring arises from a Boolean algebra, and vice versa. Furthermore, a map f : A → B is a homomorphism of Boolean algebras if and only if it is a homomorphism of Boolean rings. The categories of Boolean rings and Boolean algebras are equivalent.
An ideal of the Boolean algebra A is a subset I such that for all x, y in I we have in I and for all a in A we have in I. This notion of ideal coincides with the notion of ring ideal in the Boolean ring A. An ideal I of A is called prime if I ≠ A and if in I always implies a in I or b in I. An ideal I of A is called maximal if I ≠ A and if the only ideal properly containing I is A itself. These notions coincide with ring theoretic ones of prime ideal and maximal ideal in the Boolean ring A.
The dual of an ideal is a filter. A filter of the Boolean algebra A is a subset p such that for all x, y in p we have in p and for all a in A we have in p.
Representing Boolean algebras
It can be shown that every finite Boolean algebra is isomorphic to the Boolean algebra of all subsets of a finite set. Therefore, the number of elements of every finite Boolean algebra is a power of two.
Stone's celebrated representation theorem for Boolean algebras states that every Boolean algebra A is isomorphic to the Boolean algebra of all clopen sets in some (compact totally disconnected Hausdorff) topological space.
Axiomatics for Boolean algebras
In 1933, the American mathematician Edward Vermilye Huntington (1874–1952) set out the following elegant axiomatization for Boolean algebra. It requires just one binary operation + and a unary functional symboln, to be read as 'complement', which satisfy the following laws:
do (1), (2), and (4) form a basis for Boolean algebra? Calling (1), (2), and (4) a Robbins algebra, the question then becomes: Is every Robbins algebra a Boolean algebra? This question (which came to be known as the Robbins conjecture) remained open for decades, and became a favorite question of Alfred Tarski and his students.
In 1996, William McCune at Argonne National Laboratory, building on earlier work by Larry Wos, Steve Winker, and Bob Veroff, answered Robbins's question in the affirmative: Every Robbins algebra is a Boolean algebra. Crucial to McCune's proof was the automated reasoning program EQP he designed. For a simplification of McCune's proof, see Dahn (1998).
Generalizations
Removing the requirement of existence of a unit from the axioms of Boolean algebra yields "generalized Boolean algebras". Formally, a distributive lattice B is a generalized Boolean lattice, if it has a smallest element 0 and for any elements a and b in B such that a ≤ b, there exists an element x such that and . Defining as the unique x such that and , we say that the structure is a generalized Boolean algebra, while is a generalized Boolean semilattice. Generalized Boolean lattices are exactly the ideals of Boolean lattices.
A structure that satisfies all axioms for Boolean algebras except the two distributivity axioms is called an orthocomplemented lattice. Orthocomplemented lattices arise naturally in quantum logic as lattices of closed subspaces for separable Hilbert spaces.
Boolean algebra (or Boolean logic) is a logical calculus of truth values, developed by George Boole in the 1840s. It resembles the algebra of real numbers, but with the numeric operations of multiplication xy, addition x + y, and negation −x replaced by the respective logical operations of conjunction x∧y, disjunction x∨y, and complement ¬x. The Boolean operations are these and all other operations that can be built from these, such as x∧(y∨z). These turn out to coincide with the set of all operations on the set {0,1} that take only finitely many arguments; there are 22n such operations when there are n arguments.
The laws of Boolean algebra can be defined axiomatically as certain equations called axioms together with their logical consequences called theorems, or semantically as those equations that are true for every possible assignment of 0 or 1 to their variables. The axiomatic approach is sound and complete in the sense that it proves respectively neither more nor fewer laws than the semantic approach.
Values
Boolean algebra is the algebra of two values. These are usually taken to be 0 and 1, as we shall do here, although F and T, false and true, etc. are also in common use. For the purpose of understanding Boolean algebra any Boolean domain of two values will do.
Regardless of nomenclature, the values are customarily thought of as essentially logical in character and are therefore referred to as truth values, in contrast to the natural numbers or the reals which are considered numerical values. On the other hand the algebra of the integers modulo 2, while ostensibly just as numeric as the integers themselves, was shown to constitute exactly Boolean algebra, originally by I.I. Zhegalkin in 1927 and rediscovered independently in the west by Marshall Stone in 1936. So in fact there is some ambiguity in the true nature of Boolean algebra: it can be viewed as either logical or numeric in character.
More generally Boolean algebra is the algebra of values from any Boolean algebra as a model of the laws of Boolean algebra. For example the bit vectors of a given length, as with say 32-bit computer words, can be combined with Boolean operations in the same way as individual bits, thereby forming a 232-element Boolean algebra under those operations. Any such combination applies the same Boolean operation to all bits simultaneously. This passage from the Boolean algebra of 0 and 1 to these more general Boolean algebras is the Boolean counterpart of the passage from the algebra of the ring of integers to the algebra of commutative rings in general. The two-element Boolean algebra is the prototypical Boolean algebra in the same sense as the ring of integers is the prototypical commutative ring. Boolean logic as the subject matter of this article is independent of the choice of Boolean algebra (the same equations hold of every nontrivial Boolean algebra); hence, there is no need here to consider any Boolean algebra other than the two-element one. The article on Boolean algebra (structure) treats Boolean algebras themselves.
Operations
Basic operations
After values, the next ingredient of any algebraic system is its operations. Whereas elementary algebra is based on numeric operations multiplication xy, addition x + y, and negation −x, Boolean algebra is customarily based on logical counterparts to those operations, namely conjunction x∧y (AND), disjunction x∨y (OR), and complement or negation ¬x (NOT). In electronics, the AND is represented as a multiplication, the OR is represented as an addition, and the NOT is represented with an overbar: x ∧ y and x ∨ y, therefore, become xy and x + y.
Conjunction is the closest of these three to its numerical counterpart, in fact on 0 and 1 it is multiplication. As a logical operation the conjunction of two propositions is true when both propositions are true, and otherwise is false. The first column of Figure 1 below tabulates the values of x∧y for the four possible valuations for x and y; such a tabulation is traditionally called a truth table.
Disjunction, in the second column of the figures, works almost like addition, with one exception: the disjunction of 1 and 1 is neither 2 nor 0 but 1. Thus the disjunction of two propositions is false when both propositions are false, and otherwise is true. This is just the definition of conjunction with true and false interchanged everywhere; because of this we say that disjunction is the dual of conjunction.
Logical negation however does not work like numerical negation at all. Instead it corresponds to incrementation: ¬x = x+1 mod 2. Yet it shares in common with numerical negation the property that applying it twice returns the original value: ¬¬x = x, just as −(−x) = x. An operation with this property is called an involution. The set {0,1} has two permutations, both involutary, namely the identity, no movement, corresponding to numerical negation mod 2 (since +1 = −1 mod 2), and SWAP, corresponding to logical negation. Using negation we can formalize the notion that conjunction is dual to disjunction via De Morgan's laws, ¬(x∧y) = ¬x ∨ ¬y and ¬(x∨y) = ¬x ∧ ¬y. These can also be construed as definitions of conjunction in terms of disjunction and vice versa: x∧y = ¬(¬x ∨ ¬y) and x∨y = ¬(¬x ∧ ¬y).
Various representations of Boolean operations
Figure 2 shows the symbols used in digital electronics for conjunction and disjunction; the input ports are on the left and the signals flow through to the output port on the right. Inverters negating the input signals on the way in, or the output signals on the way out, are represented as circles on the port to be inverted.
Derived operations
Other Boolean operations are derivable from these by composition. For example implication x→y (IMP), in the third column of the figures, is a binary operation which is false when x is true and y is false, and true otherwise. It can be expressed as x→y = ¬x∨y (the OR-gate of Figure 2 with the x input inverted), or equivalently ¬(x∧¬y) (its De Morgan equivalent in Figure 3). In logic this operation is called material implication, to distinguish it from related but non-Boolean logical concepts such as entailment and relevant implication. The idea is that an implication x→y is by default true (the weaker truth value in the sense that false implies true but not vice versa) unless its premise or antecedent x holds, in which case the truth of the implication is that of its conclusion or consequent y.
Although disjunction is not the exact counterpart of numerical addition, Boolean algebra nonetheless does have an exact counterpart, called exclusive-or (XOR) or parity, x⊕y. As shown in the fourth column of the figures, the exclusive-or of two propositions is true just when exactly one of the propositions is true; equivalently when an odd number of the propositions is true, whence the name "parity". Exclusive-or is the operation of addition mod 2. The exclusive-or of any value with itself vanishes, x⊕x = 0, since the arguments have an even number of whatever value x has. Its digital electronics symbol is shown in Figure 2, being a hybrid of the disjunction symbol and the equality symbol. The latter reflects the fact that the negation (which is also the dual) of XOR, ¬(x⊕y), is logical equivalence, EQV, being true just when x and y are equal, either both true or both false. XOR and EQV are the only binary Boolean operations that are commutative and whose truth tables have equally many 0s and 1s. Exclusive-or together with conjunction constitute yet another complete basis for Boolean algebra, with the Boolean operations reformulated as the Zhegalkin polynomials.
Another example is Sheffer stroke, x|y, the NAND gate in digital electronics, which is false when both arguments are true, and true otherwise. NAND is definable by composition of negation with conjunction as x |y = ¬(x∧y). It does not have its own schematic symbol as it is easily represented as an AND gate with an inverted output. Unlike conjunction and disjunction, NAND is a binary operation that can be used to obtain negation, via the definition ¬x = x|x. With negation in hand one can then in turn define conjunction in terms of NAND via x∧y = ¬(x|y), from which all other Boolean operations of nonzero arity can then be obtained. NOR, ¬(x∨y), as the evident dual of NAND serves this purpose equally well. This universal character of NAND and NOR makes them a popular choice for gate arrays, integrated circuits with multiple general-purpose gates.
The above-mentioned duality of conjunction and disjunction is exposed further by De Morgan's laws, ¬(x∧y) = ¬x∨¬y and ¬(x∨y) = ¬x∧¬y. Figure 3 illustrates De Morgan's laws by giving for each gate its De Morgan dual, converted back to the original operation with inverters on both inputs and the outputs. In the case of implication, taking the form of an OR-gate with one inverter on disjunction, that inverter is cancelled by the second inverter that would have gone there. The De Morgan dual of XOR is just XOR with an inverter on the output (there is no separate symbol); as with implication, putting inverters on all three ports cancels the dual's output inverter. More generally, changing an odd number of inverters on an XOR gate produces the dual gate, an even number leaves the gate's functionality unchanged.
As with all the other laws in this section, De Morgan's laws may be verified case by case for each of the 2n possible valuations of the n variables occurring in the law, here two variables and hence 22 = 4 valuations. De Morgan's laws play a role in putting Boolean terms in certain normal forms, one of which we will encounter later in the section on soundness and completeness.
Figure 4 illustrates the corresponding Venn diagrams for each of the four operations presented in Figures 1-3. The interior (respectively exterior) of each circle represents the value true (respectively false) for the corresponding input, x or y. The convention followed here is to represent the true or 1 outputs as dark regions and false as light, but the reverse convention is also sometimes used.
All Boolean operations
There are infinitely many expressions that can be built from two variables using the above operations, suggesting great expressiveness. Yet a straightforward counting argument shows that only 16 distinct binary operations on two values are possible. Any given binary operation is fully determined by its output values for each possible combination of input values. The two arguments have 2 × 2 = 4 possible combinations of values between them, and there are 24 = 16 ways of assigning an output value to each of these four input values. The choice of one of these 16 assignments then determines the operation; so all together there are only 16 distinct binary operations.
The 16 binary Boolean operations can be organized as follows:
Two constant operations, 0 and 1.
Four operations dependent on one variable, namely x, ¬x, y, and ¬y, whose truth tables amount to two juxtaposed rectangles, one containing two 1s and the other two 0s.
Two operations with a "checkerboard" truth table, namely XOR and EQV.
Four operations are obtained from disjunction with some subset of its inputs negated, namely x∨y, x→y, y→x, and x|y; their truth tables contain a single 0.
The final four come from the same treatment applied to conjunction, having a single 1 in their truth tables.
10 of the 16 operations depend on both variables; all are representable schematically as an AND-gate, an OR-gate, or an XOR-gate, with one port optionally inverted. For the AND and OR gates the location of each inverter matters, for the XOR gate it does not, only whether there is an even or odd number of inverters.
Operations of other arities are possible. For example the ternary counterpart of disjunction can be obtained as (x∨y)∨z. In general an n-ary operation, one having n inputs, has 2n possible valuations of those inputs. An operation has two possibilities for each of these, whence there exist 22nn-ary Boolean operations. For example, there are 232 = 4,294,967,296 operations with 5 inputs.
Although Boolean algebra confines attention to operations that return a single bit, the concept generalizes to operations that take n bits in and return x bits instead of one bit. Digital circuit designers draw such operations as suitably shaped boxes with n wires entering on the left and m wires exiting on the right. Such multi-output operations can be understood simply as mn-ary operations. The operation count must then be raised to the m-th power, or, in the case of n inputs, (22n)m = 2m2n operations. The number of Boolean operations of this generalized kind with say 5 inputs and 5 outputs is 1.46 × 1048. A logic gate or computer module mapping 32 bits to 32 bits could implement any of 5.47 × 1041,373,247,567 operations, more than is obtained by squaring a googol 28 times.
Laws
Axioms
With values and operations in hand, the next aspect of Boolean algebra is that of laws or properties. As with many kinds of algebra, the principal laws take the form of equations between terms built up from variables using the operations of the algebra. Such an equation is deemed a law or identity just when both sides have the same value for all values of the variables, equivalently when the two terms denote the same operation.
Numeric algebra has laws such as commutativity of addition and multiplication, x + y = y + x and xy = yx. Similarly, Boolean algebra has commutativity in that x ∨ y = y ∨ x for disjunction and x ∧ y = y ∧ x for conjunction. Not all binary operations are commutative; Boolean implication is not commutative, like subtraction and division.
Another equally fundamental law is associativity, which in the case of numeric multiplication is expressed as x(yz) = (xy)z, justifying abbreviating both sides to xyz and thinking of multiplication as a single ternary operation. All four of numeric addition and multiplication and logical disjunction and conjunction are associative, giving for the latter two the Boolean laws x ∨ (y ∨ z) = (x ∨ y) ∨ z and x ∧ (y ∧ z) = (x ∧ y) ∧ z.
Again numeric subtraction and logical implication serve as examples, this time of binary operations that are not associative. On the other hand exclusive-or, being just addition mod 2, is both commutative and associative.
Boolean algebra does not completely mirror numeric algebra however, as both conjunction and disjunction satisfy idempotence, expressed respectively as x ∧ x = x and x ∨ x = x. These laws are easily verified by considering the two valuations 0 and 1 for x. But since 2 + 2 = 2 × 2 = 4 in arithmetic, clearly numeric addition and multiplication are not idempotent. With arithmetic mod 2 on the other hand, multiplication is idempotent, though not addition since 1 + 1 = 0 mod 2, reflected logically in the idempotence of conjunction but not of exclusive-or.
A more subtle difference between number and logic is with x(x + y) and x + xy, neither of which equal x numerically. In Boolean algebra however, both x ∧ (x ∨ y) and x ∨ (x ∧ y) are equal to x, as can be verified for each of the four possible valuations for x and y. These two Boolean laws are called the laws of absorption. These laws (both are needed) together with the associativity, commutativity, and idempotence of conjunction and disjunction constitute the defining laws or axioms of lattice theory. (Actually idempotence can be derived from the other axioms.)
Another law common to numbers and truth values is distributivity of multiplication over addition, when paired with distributivity of conjunction over disjunction. Numerically we have x(y + z) = xy + xz, whose Boolean algebra counterpart is x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z). On the other hand Boolean algebra also has distributivity of disjunction over conjunction, x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z), for which there is no numeric counterpart, consider 1 + 2 × 3 = 7 whereas (1 + 2) × (1 + 3) = 12. Like associativity, distributivity has three variables and so requires checking 23 = 8 cases.
Either distributivity law for Boolean algebra entails the other. Adding either to the axioms for lattices axiomatizes the theory of distributive lattices. That theory does not need the idempotence axioms because they follow from the six absorption, distributivity, and associativity laws.
Two Boolean laws having no numeric counterpart are the laws characterizing logical negation, namely x ∧ ¬x = 0 and x ∨ ¬x = 1. These are the only laws thus far that have required constants. It then follows that x ∧ 0 = x ∧ (x ∧ ¬x) = (x ∧ x) ∧ ¬x = x ∧ ¬x = 0, showing that 0 works with conjunction in logic just as it does with multiplication of numbers. Also x ∨ 0 = x ∨ (x ∧ ¬x) = x by absorption. Dualizing this reasoning, we obtain x ∨ 1 = 1 and x ∧ 1 = x. Alternatively we can justify these laws more directly simply by checking them for each of the two valuations of x.
The six laws of lattice theory along with these first two laws for negation axiomatize the theory of complemented lattices. Including either distributivity law then axiomatizes the theory of complemented distributive lattices. For convenience we collect these nine laws in one place as follows.
associativity
commutativity
absorption
distributivity
complements
The next two sections show that this theory is sufficient to axiomatize all the valid laws or identities of two-valued logic, that is, Boolean algebra. It follows that Boolean algebra as commonly defined in terms of these axioms coincides with the intuitive semantic notion of the valid identities of two-valued logic.
Derivations
While the Boolean laws enumerated in the previous section are certainly highlights of Boolean algebra, they by no means exhaust the laws, of which there are infinitely many, nor do they even exhaust the highlights. As it is out of the question to proceed in the ad hoc way of the preceding section for ever, the question arises as to how best to present the remaining laws.
One way of establishing an equation as being a law is to verify its truth for all valuations of its variables, sometimes called the method of truth tables. This is the method we depended on in the previous section to justify each law as we introduced it, constituting the semantic approach to establishing laws. From a practical standpoint the method lends itself to computer implementation for 20-30 variables because the enumeration of valuations is straightforward to program and boring to carry out, making it ideal work for a computer. Because there are 2n valuations to check the method starts to become impractical as 40 variables is approached. Beyond that the approach becomes of value mainly as the in-principle semantic definition of what constitutes an identically true or valid equation.
In contrast the syntactic approach is to derive new laws by symbolic manipulation from already established laws such as those listed in the previous section. (This is not to imply that derivations of a law shorter than the length of a semantic verification of that law need exist, although some thousand-variable laws impossible to verify by enumeration of valuations can have quite short derivations.) Here is an example showing the derivation of (w∨x)∨(y∨z) = (w∨y)∨(x∨z) from just the commutativity and associativity of disjunction.
The first two and last two steps appealed to associativity while the middle step used commutativity.
The rules of derivation for forming new laws from old can be assumed to be those permissible in high school algebra. For definiteness however it is worthwhile formulating a well-defined set of rules showing exactly what is needed. These are the domain-independent rules of equational logic, as sound for logic as they are for numerical domains or any other kind.
Reflexivity: t = t. That is, any equation whose two sides are the same term t is a law. (While arguably an axiom rather than a rule since it has no premises, we classify it as a rule because like the other three rules it is domain-independent, making no mention of specific logical, numeric, or other operations.)
Symmetry: From s = t infer t = s. That is, the two sides of a law may be interchanged. Intuitively one attaches no importance to which side of an equation a term comes from.
Transitivity: A chain s = t = u of two laws yields the law s = u. (This law of "cutting out the middleman" is applied four times in the above example to eliminate the intermediate terms.)
Substitution: Given two laws and a variable, each occurrence of that variable in the first law may be replaced by one or the other side of the second law. (Distinct occurrences can be replaced by distinct sides, but every occurrence must be replaced by one or the other side.)
While the first equation in the above example might seem simply a straightforward application of the associativity law, when analyzed more carefully according to the above rules it can be seen to require something more. We can justify it in terms of the reflexivity and substitution rules. Beginning with the laws x∨(y∨z) = (x∨y)∨z and w∨x = w∨x, we use substitution to replace both occurrences of x by w∨x to arrive at the first equation. All five equations in the chain are accounted for along similar lines, with commutativity in place of associativity in the middle equation.
Soundness and completeness
It can be shown that the two approaches, semantic and syntactic, to constructing all the laws of Boolean algebra lead to the same set of laws. We say that the syntactic approach is sound when it yields a subset of the semantically obtained laws, and complete when it yields a superset thereof. We can then restate this coinciding of the semantic and syntactic approaches as the soundness and completeness of the syntactic approach with respect to (or as calibrated by) the semantic approach.
Soundness follows firstly from the fact that the initial laws or axioms we started from were all identities, that is, semantically true laws. Secondly it depends on the easily verified fact that the rules preserve identities.
Completeness can be proved by first deriving a few additional useful laws and then showing how to use the axioms and rules to prove that a term with n variables, ordered alphabetically say, is equal to its n-ary normal form, namely a unique term associated with the n-ary Boolean operation realized by that term with the variables in that order. It then follows that if two terms denote the same operation (the same thing as being semantically equal), they are both provably equal to the normal form term denoting that operation, and hence by transitivity provably equal to each other.
There is more than one suitable choice of normal form, but complete disjunctive normal form will do. A literal is either a variable or a negated variable. A disjunctive normal form (DNF) term is a disjunction of conjunctions of literals. (Associativity allows a term such as x∨(y∨z) to be viewed as the ternary disjunction x∨y∨z, likewise for longer disjunctions, and similarly for conjunction.) A DNF term is complete when every disjunct (conjunction) contains exactly one occurrence of each variable, independently of whether or not the variable is negated. Such a conjunction uniquely represents the operation it denotes by virtue of serving as a coding of those valuations at which the operation returns 1. Each conjunction codes the valuation setting the positively occurring variables to 1 and the negated ones to 0; the value of the conjunction at that valuation is 1, and hence so is the whole term. At valuations corresponding to omitted conjunctions, all conjunctions present in the term evaluate to 0 and hence so does the whole term.
In outline the general technique for converting any term to its normal form, or normalizing it, is to use De Morgan's laws to push the negations down to the variables. This yields monotone normal form, a term built from literals with conjunctions and disjunctions. For example ¬(x ∨ (¬y∧z)) becomes ¬x ∧ ¬(¬y∧z) and then ¬x ∧ (¬¬y∨¬z). Applying ¬¬x = x then yields ¬x ∧ (y∨¬z).
Next use distributivity of conjunction over disjunction to push all conjunctions down below all disjunctions, yielding a DNF term. This makes the above example (¬x∧y) ∨ (¬x∧¬z).
Then for each variable y, replace each conjunction x not containing y with the disjunction of two copies of x, with y conjoined to one copy of x and ¬y conjoined to the other, in the end yielding a complete DNF term. (This is one place where an auxiliary law helps, in this case x = x∧1 = x∧(y∨¬y) = (x∧y) ∨ (x∧¬y).) In the above example the first conjunction lacks z while the second lacks y; expanding appropriately yields the complete DNF term (¬x∧y∧z) ∨ (¬x∧y∧¬z) ∨ (¬x∧¬z∧y) ∨ (¬x∧¬z∧¬y).
Next use commutativity to put the literals in each conjunction in alphabetical order. The example becomes (¬x∧y∧z) ∨ (¬x∧y∧¬z) ∨ (¬x∧y∧¬z) ∨ (¬x∧¬y∧¬z). This brings any repeated copies of literals next to each other; delete the redundant copies using idempotence of conjunction, not needed in our example.
Lastly order the disjuncts according to a suitable uniformly applied criterion. The criterion we use here is to read the positive and negative literals of a conjunction as respectively 1 and 0 bits, and to read the bits in a conjunction as a binary number. In our example the bits are 011, 010, 010, 000, or in decimal 3, 2, 2, 0. Ordering them numerically as 0, 2, 2, 3 yields (¬x∧¬y∧¬z) ∨ (¬x∧y∧¬z) ∨ (¬x∧y∧¬z) ∨ (¬x∧y∧z). Note that these bits are exactly those valuations for x, y, and z that satisfy our original term ¬(x∨(¬y∧z)). Complete DNF amounts to a canonical way of representing the truth table for the original term as another term.
Repeated conjunctions can then be deleted using idempotence of disjunction, which simplifies our example to (¬x∧¬y∧¬z) ∨ (¬x∧y∧¬z) ∨ (¬x∧y∧z).
In this way we have proved that the term we started with is equal to the normal form term for the operation it denotes. Hence all terms denoting that operation are provably equal to the same normal form term and hence by transitivity to each other.
In traditional logic, an axiom or postulate is a proposition that is not proved or demonstrated but considered to be either self-evident, or subject to necessary decision. Therefore, its truth is taken for granted, and serves as a starting point for deducing and inferring other (theory dependent) truths.
In mathematics, the term axiom is used in two related but distinguishable senses: "logical axioms" and "non-logical axioms". In both senses, an axiom is any mathematical statement that serves as a starting point from which other statements are logically derived. Unlike theorems, axioms (unless redundant) cannot be derived by principles of deduction, nor are they demonstrable by mathematical proofs, simply because they are starting points; there is nothing else from which they logically follow (otherwise they would be classified as theorems).
Logical axioms are usually statements that are taken to be universally true (e.g., A and B implies A), while non-logical axioms (e.g., a + b = b + a) are actually defining properties for the domain of a specific mathematical theory (such as arithmetic). When used in the latter sense, "axiom," "postulate", and "assumption" may be used interchangeably. In general, a non-logical axiom is not a self-evident truth, but rather a formal logical expression used in deduction to build a mathematical theory. To axiomatize a system of knowledge is to show that its claims can be derived from a small, well-understood set of sentences (the axioms). There are typically multiple ways to axiomatize a given mathematical domain.
Outside logic and mathematics, the term "axiom" is used loosely for any established principle of some field.
In classical logic, modus ponendo ponens (Latin for the way that affirms by affirming;[1] often abbreviated to MP or modus ponens) is a valid, simple argument form sometimes referred to as affirming the antecedent or the law of detachment. It is closely related to another valid form of argument, modus tollens.
Modus ponens is a very common rule of inference, and takes the following form:
If P, then Q.
P.
Therefore, Q
Formal notation
The modus ponens rule may be written in sequent notation:
or in rule form:
Explanation
The argument form has two premises. The first premise is the "if–then" or conditional claim, namely that P implies Q. The second premise is that P, the antecedent of the conditional claim, is true. From these two premises it can be logically concluded that Q, the consequent of the conditional claim, must be true as well. In Artificial Intelligence, modus ponens is often called forward chaining.
An example of an argument that fits the form modus ponens:
If today is Tuesday, then I will go to work.
Today is Tuesday.
Therefore, I will go to work.
This argument is valid, but this has no bearing on whether any of the statements in the argument are true; for modus ponens to be a sound argument, the premises must be true for any true instances of the conclusion. An argument can be valid but nonetheless unsound if one or more premises are false; if an argument is valid and all the premises are true, then the argument is sound. For example, I might be going to work on Wednesday. In this case, the reasoning for my going to work (because it is Wednesday) is unsound. The argument is only sound on Tuesdays (when I go to work), but valid on every day of the week. A propositional argument using modus ponens is said to be deductive.
In single-conclusion sequent calculi, modus ponens is the Cut rule. The cut-elimination theorem for a calculus says that every proof involving Cut can be transformed (generally, by a constructive method) into a proof without Cut, and hence that Cut is admissible.
The Curry-Howard correspondence between proofs and programs relates modus ponens to function application: if f is a function of type P → Q and x is of type P, then f x is of type Q.
Justification via truth table
The validity of modus ponens in classical two-valued logic can be clearly demonstrated by use of a truth table.
p
q
p → q
T
T
T
T
F
F
F
T
T
F
F
T
In instances of modus ponens we assume as premises that p → q is true and p is true. Only one line of the truth table - the first - satisfies these two conditions. On this line, q is also true. Therefore, whenever p → q is true and p is true, q must also be true.
Affine logic is a substructural logic whose proof theory rejects the structural rule of contraction. It can also be characterized as linear logic with weakening.
The name "affine logic" is associated with linear logic, to which is differs by allowing the weakening rule. Jean-Yves Girard introduced the name as part of the geometry of interaction semantics of linear logic, which characterises linear logic in terms of linear algebra; here he alludes to affine transformations on vector spaces[1].
The logic predated linear logic. V. N. Grishin used this logic in 1974[2], after observing that Russell's paradox cannot be derived in a set theory without contraction, even with an unbounded comprehension axiom[3]. Likewise, the logic formed the basis of a decidable subtheory of predicate logic, called 'Direct logic' (Ketonen & Wehrauch, 1984; Ketonen & Bellin, 1989).
Affine logic can be embedded into linear logic by rewriting the affine arrow as the linear arrow .
Whereas full linear logic (ie. propositional linear logic with multiplicatives, additives and exponentials) is undecidable, full affine logic is decidable.
Abduction is a method of logical inference introduced by Charles Sanders Peirce which comes prior to induction and deduction for which the colloquial name is to have a "hunch". Abductive reasoning starts when an inquirer considers of a set of seemingly unrelated facts, armed with an intuition that they are somehow connected. The term abduction is commonly presumed to mean the same thing as hypothesis; however, an abduction is actually the process of inference that produces a hypothesis as its end result.[1] It is used in both philosophy and computing.
Deduction, induction, and abduction
Deduction
allows deriving b as a consequence of a. In other words, deduction is the process of deriving the consequences of what is assumed. Given the truth of the assumptions, a valid deduction guarantees the truth of the conclusion. It is true by definition and is independent of sense experience. For example, if it is true (given) that the sum of the angles is 180° in all triangles, and if a certain triangle has angles of 90° and 30°, then it can be deduced that the third angle is 60°.
Induction
allows inferring a entails b from multiple instantiations of a and b at the same time. Induction is the process of inferring probable conditional relevance as a result of observing multiple antecedents and consequents. An inductive statement requires empirical evidence for it to be true. For example, the statement "it's snowing, so it must be cold", can be induced from the experience of the two being true together.
Abduction
allows inferring a as an explanation of b. Because of this, abduction allows the precondition a to be inferred from the consequence b. Deduction and abduction thus differ in the direction in which a rule like "a entails b" is used for inference. As such abduction is formally equivalent to the logical fallacy affirming the consequent or Post hoc ergo propter hoc, because there are multiple possible explanations for b.
Unlike deduction and in some sense induction, abduction can produce results that are incorrect within its formal system. Hence the conclusions of abduction can only be made valid by separately checking them with a different method, either by deduction or exhaustive induction. However, it can still be useful as a heuristic, especially when something is known about the likelihood of different causes for b.
Formalizations of Abduction
Logic-based abduction
In logic, explanation is done from a logical theory T representing a domain and a set of observations O. Abduction is the process of deriving a set of explanations of O according to T and picking out one of those explanations. For E to be an explanation of O according to T, it should satisfy two conditions:
O follows from E and T;
E is consistent with T.
In formal logic, O and E are assumed to be sets of literals. The two conditions for E being an explanation of O according to theory T are formalized as:
;
is consistent.
Among the possible explanations E satisfying these two conditions, some other condition of minimality is usually imposed to avoid irrelevant facts (not contributing to the entailment of O) being included in the explanations. Abduction is then the process that picks out some member of E. Criteria for picking out a member representing "the best" explanation include the simplicity, the prior probability, or the explanatory power of the explanation.
A proof theoretical abduction method for first order classical logic based on the sequent calculus and a dual one, based on semantic tableaux (analytic tableaux) have been proposed (Cialdea Mayer & Pirri 1993). The methods are sound and complete and work for full first order logic, without requiring any preliminary reduction of formulae into normal forms. These methods have also been extended to modal logic.
Abductive logic programming is a computational framework that extends normal logic programming with abduction. It separates the theory T into two components, one of which is a normal logic program, used to generate E by means of backward reasoning, the other of which is a set of integrity constraints, used to filter the set of candidate explanations.
Set-cover abduction
A different formalization of abduction is based on inverting the function that calculates the visible effects of the hypotheses. Formally, we are given a set of hypotheses H and a set of manifestations M; they are related by the domain knowledge, represented by a function e that takes as an argument a set of hypotheses and gives as a result the corresponding set of manifestations. In other words, for every subset of the hypotheses , their effects are known to be e(H').
Abduction is performed by finding a set such that . In other words, abduction is performed by finding a set of hypotheses H' such that their effects e(H') include all observations M.
A common assumption is that the effects of the hypotheses are independent, that is, for every , it holds that . If this condition is met, abduction can be seen as a form of set covering.
Abductive validation
Abductive validation is the process of validating a given hypothesis through abductive reasoning. This can also be called reasoning through successive approximation. Under this principle, an explanation is valid if it is the best possible explanation of a set of known data. The best possible explanation is often defined in terms of simplicity and elegance (see Occam's razor). Abductive validation is common practice in hypothesis formation in science; moreover, Peirce argues it is a ubiquitous aspect of thought:
Looking out my window this lovely spring morning I see an azalea in full bloom. No, no! I do not see that; though that is the only way I can describe what I see. That is a proposition, a sentence, a fact; but what I perceive is not proposition, sentence, fact, but only an image, which I make intelligible in part by means of a statement of fact. This statement is abstract; but what I see is concrete. I perform an abduction when I so much as express in a sentence anything I see. The truth is that the whole fabric of our knowledge is one matted felt of pure hypothesis confirmed and refined by induction. Not the smallest advance can be made in knowledge beyond the stage of vacant staring, without making an abduction at every step.[2]
It was Peirce's own maxim that "Facts cannot be explained by a hypothesis more extraordinary than these facts themselves; and of various hypotheses the least extraordinary must be adopted."[3] After obtaining results from an inference procedure, we may be left with multiple assumptions, some of which may be contradictory. Abductive validation is a method for identifying the assumptions that will lead to your goal.
Probabilistic abduction
Probabilistic abductive reasoning is a form of abductive validation, and is used extensively in areas where conclusions about possible hypotheses need to be derived, such as for making diagnoses from medical tests. For example, a pharmaceutical company that develops a test for a particular infectious disease will typically determine the reliability of the test by letting a group of infected and a group of non-infected people undergo the test. Assume the statements x: "Positive test", : "Negative test", y: "Infected", and : "Not infected". The result of these trials will then determine the reliability of the test in terms of its sensitivity p(x | y) and false positive rate . The interpretations of the conditionals are: p(x | y): "The probability of positive test given infection", and : "The probability of positive test in the absence of infection". The problem with applying these conditionals in a practical setting is that they are expressed in the opposite direction to what the practitioner needs. The conditionals needed for making the diagnosis are: p(y | x): "The probability of infection given positive test", and : "The probability of infection given negative test". The probability of infection could then have been conditionally deduced as , where "" denotes conditional deduction. Unfortunately the required conditionals are usually not directly available to the medical practitioner, but they can be obtained if the base rate of the infection in the population is known.
The required conditionals can be correctly derived by inverting the available conditionals using Bayes rule. The inverted conditionals are obtained as follows: The term p(y) on the right hand side of the equation expresses the base rate of the infection in the population. Similarly, the term p(x) expresses the default likelihood of positive test on a random person in the population. In the expressions below a(y) and denote the base rates of y and its complement respectively, so that e.g. . The full expression for the required conditionals p(y | x) and are then:
The full expression for the conditionally abduced probability of infection in a tested person, expressed as , given the outcome of the test, the base rate of the infection, as well as the test's sensitivity and false positive rate, is then given by: .
Probabilistic abduction can thus be described as a method for inverting conditionals in order to apply probabilistic deduction.
A medical test result is typically considered positive or negative, so when applying the above equation it can be assumed that either p(x) = 1 (positive) or (negative). In case the patient tests positive, the above equation can be simplified to which will give the correct likelihood that the patient actually is infected.
The Base rate fallacy in medicine,[4] or the Prosecutor's fallacy[5] in legal reasoning, consists of making the erroneous assumption that p(y | x) = p(x | y). While this reasoning error often can produce a relatively good approximation of the correct hypothesis probability value, it can lead to a completely wrong result and wrong conclusion in case the base rate is very low and the reliability of the test is not perfect. An extreme example of the base rate fallacy is to conclude that a male person is pregnant just because he tests positive in a pregnancy test. Obviously, the base rate of male pregnancy is zero, and assuming that the test is not perfect, it would be correct to conclude that the male person is not pregnant.
The expression for probabilistic abduction can be generalised to multinomial cases,[6] i.e., with a state space X of multiple xi and a state space Y of multiple states yj.
Subjective logic abduction
Subjective logic generalises probabilistic logic by including parameters for uncertainty in the input arguments. Abduction in subjective logic is thus similar to probabilistic abduction described above.[6] The input arguments in subjective logic are composite functions called subjective opinions which can be binomial when the opinion applies to a single proposition or multinomial when it applies to a set of propositions. A multinomial opinion thus applies to a frame (i.e. a state space of exhaustive and mutually disjoint propositions ), and is denoted by the composite function , where is a vector of belief masses over the propositions of , is the uncertainty mass, and is a vector of base rate values over the propositions of . These components satisfy and as well as .
Assume the frames X and Y, the sets of conditional opinions ωX | Y and , the opinion ωX on X, and the base rate function aY on Y. Based on these parameters, subjective logic provides a method for deriving the set of inverted conditionals ωY | X and . Using these inverted conditionals, subjective logic also provides a method for deduction. Abduction in subjective logic consists of inverting the conditionals and then applying deduction.
The symbolic notation for conditional abduction is "", and the operator itself is denoted as . The expression for subjective logic abduction is then:[6].
The advantage of using subjective logic abduction compared to probabilistic abduction is that uncertainty about the probability values of the input arguments can be explicitly expressed and taken into account during the analysis. It is thus possible to perform abductive analysis in the presence of missing or incomplete input evidence, which normally results in degrees of uncertainty in the output conclusions.
Applications
Applications in artificial intelligence include fault diagnosis, belief revision, and automated planning. The most direct application of abduction is that of automatically detecting faults in systems: given a theory relating faults with their effects and a set of observed effects, abduction can be used to derive sets of faults that are likely to be the cause of the problem.
Abduction can also be used to model automated planning.[15] Given a logical theory relating action occurrences with their effects (for example, a formula of the event calculus), the problem of finding a plan for reaching a state can be modeled as the problem of abducting a set of literals implying that the final state is the goal state.
In intelligence analysis, Analysis of Competing Hypotheses and Bayesian networks, probabilistic abductive reasoning is used extensively. Similarly in medical diagnosis and legal reasoning, the same methods are being used, although there have been many examples of errors, especially caused by the base rate fallacy and the prosecutor's fallacy.
Belief revision, the process of adapting beliefs in view of new information, is another field in which abduction has been applied. The main problem of belief revision is that the new information may be inconsistent with the corpus of beliefs, while the result of the incorporation cannot be inconsistent. This process can be done by the use of abduction: once an explanation for the observation has been found, integrating it does not generate inconsistency. This use of abduction is not straightforward, as adding propositional formulae to other propositional formulae can only make inconsistencies worse. Instead, abduction is done at the level of the ordering of preference of the possible worlds. Preference models use fuzzy logic or utility models.
In the philosophy of science, abduction has been the key inference method to support scientific realism, and much of the debate about scientific realism is focused on whether abduction is an acceptable method of inference.
In historical linguistics, abduction during language acquisition is often taken to be an essential part of processes of language change such as reanalysis and analogy.[16]
In anthropology, Alfred Gell in his influential book Art and Agency defined abduction, (after Eco[17]) as “a case of synthetic inference 'where we find some very curious circumstances, which would be explained by the supposition that it was a case of some general rule, and thereupon adopt that supposition”.[18] Gell criticizes existing 'anthropological' studies of art, for being too preoccupied with aesthetic value and not preoccupied enough with the central anthropological concern of uncovering 'social relationships' specifically the social contexts in which artworks are produced, circulated, and received.[19] Abduction is used as the basis of one gets from art to agency in the sense of a theory of how works of art can inspire a sensus communis, or the commonly-held views that a characteristic of a given society because they are shared by everyone in that society.[20] The question Gell asks in the book is, ‘how does initially to ‘speak’ to people?’ He answers by saying that “No reasonable person could suppose that art-like relations between people and things do not involve at least some form of semiosis.”[18] However, he rejects any intimation that semiosis can be thought of as a language because then he would have to admit to some pre-established existence of the sensus communis that he wants to claim only emerges afterward out of art. Abduction is the answer to this conundrum because the tentative nature of the abduction concept (Pierce likened it to guessing) means that not only can it operate outside of any pre-existing framework, but moreover, it can actually intimate the existence of a framework. As Gell reasons in his analysis, the physical existence of the artwork prompts the viewer to perform an abduction that imbues the artwork with intentionality. A statue of a goddess, for example, in some senses actually becomes the goddess in the mind of the beholder; and represents not only the form of the deity but also her intentions (which are adduced from the feeling of her very presence). Therefore through abduction, Gell claims that art can have the kind of agency that plants the seeds that grow into cultural myths. The power of agency is the power to motivate actions and inspire ultimately the shared understanding that characterizes any given society.
A great circle of a sphere is a circle that runs along the surface of that sphere so as to cut it into two equal halves, as distinct from a small circle. The great circle therefore has both the same circumference and the same center as the sphere. It is the largest circle that can be drawn on a given sphere.
Great circles serve as the analogue of "straight lines" in spherical geometry. See also spherical trigonometry and geodesic.
The great circle, also known as the Riemannian circle, is the path with the smallest curvature, and hence, an arc (or an orthodrome) of a great circle is the shortest path between two points on the surface. The distance between any two points on a sphere is therefore known as the great-circle distance. The great-circle route is the shortest path between two points across the surface of a sphere
In geometry, an ellipse (from Greek ἔλλειψις elleipsis, a "falling short") is a plane curve that results from the intersection of a cone by a plane in a way that produces a closed curve. Circles are special cases of ellipses, obtained when the cutting plane is perpendicular to the axis. An ellipse is also the locus of all points of the plane whose distances to two fixed points add to the same constant.
Ellipses are closed curves and are the bounded case of the conic sections, the curves that result from the intersection of a circular cone and a plane that does not pass through its apex; the other two (open and unbounded) cases are parabolas and hyperbolas. Ellipses also arise as images of a circle under parallel projection and some cases of perspective projection. It is also the simplest Lissajous figure, formed when the horizontal and vertical motions are sinusoids with the same frequency.
Elements of an ellipse
The ellipse and some of its mathematical properties.
An ellipse is a smooth closed curve which is symmetric about its center. The distance between antipodal points on the ellipse, or pairs of points whose midpoint is at the center of the ellipse, is maximum and minimum along two perpendicular directions, the major axis or transverse diameter, and the minor axis or conjugate diameter.[1]
The semimajor axis (denoted by a in the figure) and the semiminor axis (denoted by b in the figure) are one half of the major and minor diameters, respectively. These are sometimes called (especially in technical fields) the major and minor semi-axes,[2][3] the major and minor semiaxes,[4][5] or major radius and minor radius.[6][7][8][9] When a and b are equal, the foci coincide with the center, and the ellipse becomes a circle with radius a=b.
There are two special points F1 and F2 on the ellipse's major axis, on either side of the center, such that the sum of the distances from any point of the ellipse to those two points is constant and equal to the major diameter (2a). Each of these two points is called a focus of the ellipse.
The eccentricity of an ellipse, usually denoted by ε or e, is the ratio of the distance between the foci to the length of the major axis. The eccentricity is necessarily between 0 and 1; it is zero if and only if a=b, in which case the ellipse is a circle. As the eccentricity tends to 1, the ellipse gets a more elongated shape and tends either towards a line segment (see below) or a parabola, and the ratio a/b tends to infinity. The distance ae from a focal point to the centre is called the linear eccentricity of the ellipse.
Drawing ellipses
The pins-and-string method
Drawing an ellipse with two pins, a loop and a pen.
An ellipse can be drawn using two drawing pins, a length of string, and a pencil:
Push the pins into the paper at two points, which will become the ellipse's foci. Tie the string into a loose loop around the two pins. Pull the loop taut with the pencil's tip, so as to form a triangle. Move the pencil around, while keeping the string taut, and its tip will trace out an ellipse.
If the ellipse is to be inscribed within a specified rectangle, tangent to its four sides at their midpoints, one must first determine the position of the foci and the length of the string loop:
Let A,B,C,D be the corners of the rectangle, in clockwise order, with A-B being one of the long sides. Draw a circle centered on A, whose radius is the short side A-D. From corner B draw a tangent to the circle. The length L of this tangent is the distance between the foci. Draw two perpendicular lines through the center of the rectangle and parallel to its sides; these will be the major and minor axes of the ellipse. Place the foci on the major axis, symmetrically, at distance L/2 from the center.
To adjust the length of the string loop, insert a pin at one focus, and another pin at the opposite end of the major diameter. Loop the string around the two pins and tie it taut. Then draw the ellipse as above; it should fit snugly in the original rectangle.
Other methods
Trammel of Archimedes (elpsograph) animation
An ellipse can also be drawn using a ruler, a set square, and a pencil:
Draw two perpendicular lines M,N on the paper; these will be the major and minor axes of the ellipse. Mark three points A, B, C on the ruler. With one hand, move the ruler onto the paper, turning and sliding it so as to keep point A always on line M, and B on line N. With the other hand, keep the pencil's tip on the paper, following point C of the ruler. The tip will trace out an ellipse.
The trammel of Archimedes or ellipsograph is a mechanical device that implements this principle. The ruler is replaced by a rod with a pencil holder (point C) at one end, and two adjustable side pins (points A and B) that slide into two perpendicular slots cut into a metal plate. [10] The mechanism can be used with a router to cut ellipses from board material. The mechanism is also used in a toy called the "nothing grinder".
Approximations to ellipses
An ellipse of low eccentricity can be represented reasonably accurately by a circle with its centre offset. With the exception of Mercury, all the planets have an orbit whose minor axis differs from the major axis by less than half of one percent. To draw the orbit with a pair of compasses the centre of the circle should be offset from the focus by an amount equal to the eccentricity multiplied by the radius.
Decision theory in philosophy, mathematics and statistics is concerned with identifying the values, uncertainties and other issues relevant in a given decision, its rationality, and the resulting optimal decision. It is very closely related to the field of game theory.
Normative and descriptive decision theory
Most of decision theory is normative or prescriptive, i.e., it is concerned with identifying the best decision to take, assuming an ideal decision maker who is fully informed, able to compute with perfect accuracy, and fully rational. The practical application of this prescriptive approach (how people actually make decisions) is called decision analysis, and aimed at finding tools, methodologies and software to help people make better decisions. The most systematic and comprehensive software tools developed in this way are called decision support systems.
Since people usually do not behave in ways consistent with axiomatic rules, often their own, leading to violations of optimality, there is a related area of study, called a positive or descriptive discipline, attempting to describe what people will actually do. Since the normative, optimal decision often creates hypotheses for testing against actual behaviour, the two fields are closely linked. Furthermore it is possible to relax the assumptions of perfect information, rationality and so forth in various ways, and produce a series of different prescriptions or predictions about behaviour, allowing for further tests of the kind of decision-making that occurs in practice.
What kinds of decisions need a theory?
Choice under uncertainty
This area represents the heart of decision theory. The procedure now referred to as expected value was known from the 17th century. Blaise Pascal invoked it in his famous wager (see below), which is contained in his Pensées, published in 1670. The idea of expected value is that, when faced with a number of actions, each of which could give rise to more than one possible outcome with different probabilities, the rational procedure is to identify all possible outcomes, determine their values (positive or negative) and the probabilities that will result from each course of action, and multiply the two to give an expected value. The action to be chosen should be the one that gives rise to the highest total expected value. In 1738, Daniel Bernoulli published an influential paper entitled Exposition of a New Theory on the Measurement of Risk, in which he uses the St. Petersburg paradox to show that expected value theory must be normatively wrong. He also gives an example in which a Dutch merchant is trying to decide whether to insure a cargo being sent from Amsterdam to St Petersburg in winter, when it is known that there is a 5% chance that the ship and cargo will be lost. In his solution, he defines a utility function and computes expected utility rather than expected financial value.
In the 20th century, interest was reignited by Abraham Wald's 1939 paper[1] pointing out that the two central concerns of orthodox statistical theory at that time, namely statistical hypothesis testing and statistical estimation theory, could both be regarded as particular special cases of the more general decision problem. This paper introduced much of the mental landscape of modern decision theory, including loss functions, risk functions, admissible decision rules, a priori distributions, Bayes decision rules, and minimax decision rules. The phrase "decision theory" itself was first used in 1950 by E. L. Lehmann.[citation needed]
The rise of subjective probability theory, from the work of Frank Ramsey, Bruno de Finetti, Leonard Savage and others, extended the scope of expected utility theory to situations where only subjective probabilities are available. At this time it was generally assumed in economics that people behave as rational agents and thus expected utility theory also provided a theory of actual human decision-making behaviour under risk. The work of Maurice Allais and Daniel Ellsberg showed that this was clearly not so. The prospect theory of Daniel Kahneman and Amos Tversky placed behavioural economics on a more evidence-based footing. It emphasized that in actual human (as opposed to normatively correct) decision-making "losses loom larger than gains", people are more focused on changes in their utility states than the states themselves and estimation of subjective probabilities is severely biased by anchoring.
Castagnoli and LiCalzi (1996),[citation needed] Bordley and LiCalzi (2000)[citation needed] recently showed that maximizing expected utility is mathematically equivalent to maximizing the probability that the uncertain consequences of a decision are preferable to an uncertain benchmark (e.g., the probability that a mutual fund strategy outperforms the S&P 500 or that a firm outperforms the uncertain future performance of a major competitor.). This reinterpretation relates to psychological work suggesting that individuals have fuzzy aspiration levels (Lopes & Oden),[citation needed] which may vary from choice context to choice context. Hence it shifts the focus from utility to the individual's uncertain reference point.
Pascal's Wager is a classic example of a choice under uncertainty. The uncertainty, according to Pascal, is whether or not God exists. Belief or non-belief in God is the choice to be made. However, the reward for belief in God if God actually does exist is infinite. Therefore, however small the probability of God's existence, the expected value of belief exceeds that of non-belief, so it is better to believe in God. (There are several criticisms of the argument.)
Intertemporal choice
This area is concerned with the kind of choice where different actions lead to outcomes that are realised at different points in time. If someone received a windfall of several thousand dollars, they could spend it on an expensive holiday, giving them immediate pleasure, or they could invest it in a pension scheme, giving them an income at some time in the future. What is the optimal thing to do? The answer depends partly on factors such as the expected rates of interest and inflation, the person's life expectancy, and their confidence in the pensions industry. However even with all those factors taken into account, human behavior again deviates greatly from the predictions of prescriptive decision theory, leading to alternative models in which, for example, objective interest rates are replaced by subjective discount rates.
Competing decision makers
Some decisions are difficult because of the need to take into account how other people in the situation will respond to the decision that is taken. The analysis of such social decisions is more often treated under the label of game theory, rather than decision theory, though it involves the same mathematical methods. From the standpoint of game theory most of the problems treated in decision theory are one-player games (or the one player is viewed as playing against an impersonal background situation). In the emerging socio-cognitive engineering the research is especially focused on the different types of distributed decision-making in human organizations, in normal and abnormal/emergency/crisis situations.
The signal detection theory is based on the Decision theory.
Complex decisions
Other areas of decision theory are concerned with decisions that are difficult simply because of their complexity, or the complexity of the organization that has to make them. In such cases the issue is not the deviation between real and optimal behaviour, but the difficulty of determining the optimal behaviour in the first place. The Club of Rome, for example, developed a model of economic growth and resource usage that helps politicians make real-life decisions in complex situations[citation needed].
Paradox of choice
Observed in many cases is the paradox that more choices may lead to a poorer decision or a failure to make a decision at all. It is sometimes theorized to be caused by analysis paralysis, real or perceived, or perhaps from rational ignorance. A number of researchers including Sheena S. Iyengar and Mark R. Lepper have published studies on this phenomenon.[2] This analysis was popularized by Barry Schwartz in his 2004 book, The Paradox of Choice.
Statistical decision theory
Several statistical tools and methods are available to organize evidence, evaluate risks, and aid in decision making. The risks of Type I and type II errors can be quantified (estimated probability, cost, expected value, etc) and rational decision making is improved.
One example shows a structure for deciding guilt in a criminal trial:
Actual condition
Guilty
Not guilty
Decision
Verdict of 'guilty'
True Positive
False Positive (i.e. guilt reported unfairly) Type I error
Verdict of 'not guilty'
False Negative (i.e. guilt not detected) Type II error
True Negative
Alternatives to decision theory
A highly controversial issue is whether one can replace the use of probability in decision theory by other alternatives.
Probability theory
The Advocates of probability theory point to:
the work of Richard Threlkeld Cox for justification of the probability axioms,
the Dutch book paradoxes of Bruno de Finetti as illustrative of the theoretical difficulties that can arise from departures from the probability axioms, and
the complete class theorems, which show that all admissible decision rules are equivalent to the Bayesian decision rule for some utility function and some prior distribution (or for the limit of a sequence of prior distributions). Thus, for every decision rule, either the rule may be reformulated as a Bayesian procedure, or there is a (perhaps limiting) Bayesian rule that is sometimes better and never worse.
Alternatives to probability theory
The proponents of fuzzy logic, possibility theory, Dempster-Shafer theory and info-gap decision theory maintain that probability is only one of many alternatives and point to many examples where non-standard alternatives have been implemented with apparent success; notably, probabilistic decision theory is sensitive to assumptions about the probabilities of various events, while non-probabilistic rules such as minimax are robust, in that they do not make such assumptions. [3]
General criticism
A general criticism of decision theory based on a fixed universe of possibilities is that it considers the "known unknowns", not the "unknown unknowns": it focuses on expected variations, not on unforeseen events, which some argue (as in black swan theory) have outsized impact and must be considered – significant events may be "outside model". This line of argument, called the ludic fallacy, is that there are inevitable imperfections in modeling the real world by particular models, and that unquestioning reliance on models blinds one to their limits.
Fuzzy logic is a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than precise. In contrast with "crisp logic", where binary sets have binary logic, the fuzzy logic variables may have a membership value of not only 0 or 1 – that is, the degree of truth of a statement can range between 0 and 1 and is not constrained to the two truth values of classic propositional logic.[1] Furthermore, when linguistic variables are used, these degrees may be managed by specific functions.
Fuzzy logic emerged as a consequence of the 1965 proposal of fuzzy set theory by Lotfi Zadeh.[2][3] Though fuzzy logic has been applied to many fields, from control theory to artificial intelligence, it still remains controversial among most statisticians, who prefer Bayesian logic, and some control engineers, who prefer traditional two-valued logic.
Degrees of truth
Fuzzy logic and probabilistic logic are mathematically similar – both have truth values ranging between 0 and 1 – but conceptually distinct, due to different interpretations. Fuzzy logic corresponds to "degrees of truth", while probabilistic logic corresponds to "probability, likelihood"; as these differ, fuzzy logic and probabilistic logic yield different models of the same real-world situations.
Both degrees of truth and probabilities range between 0 and 1 and hence may seem similar at first. Truth represents membership in vaguely defined sets, not likelihood of some event or condition as in probability theory. For example, let a 100 ml glass contain 30 ml of water. Then we may consider two concepts: Empty and Full. The meaning of each of them can be represented by a certain fuzzy set. Then one might define the glass as being 0.7 empty and 0.3 full. Note that the concept of emptiness would be subjective and thus would depend on the observer or designer. Another designer might equally well design a set membership function where the glass would be considered full for all values down to 50 ml. It is essential to realize that fuzzy logic uses truth degrees as a mathematical model of the vagueness phenomenon while probability is a mathematical model of randomness. A probabilistic setting would first define a scalar variable for the fullness of the glass, and second, conditional distributions describing the probability that someone would call the glass full given a specific fullness level. This model, however, has no sense without accepting occurrence of some event, e.g. that after a few minutes, the glass will be half empty. Note that the conditioning can be achieved by having a specific observer that randomly selects the label for the glass, a distribution over deterministic observers, or both. Consequently, probability has nothing in common with fuzziness, these are simply different concepts which superficially seem similar because of using the same unit interval of real numbers [0,1]. Still, since theorems such as De Morgan's have dual applicability and properties of random variables are analogous to properties of binary logic states, one can see where the confusion might arise.
Applying truth values
A basic application might characterize subranges of a continuous variable. For instance, a temperature measurement for anti-lock brakes might have several separate membership functions defining particular temperature ranges needed to control the brakes properly. Each function maps the same temperature value to a truth value in the 0 to 1 range. These truth values can then be used to determine how the brakes should be controlled.
Fuzzy logic temperature
In this image, the meaning of the expressions cold, warm, and hot is represented by functions mapping a temperature scale. A point on that scale has three "truth values" — one for each of the three functions. The vertical line in the image represents a particular temperature that the three arrows (truth values) gauge. Since the red arrow points to zero, this temperature may be interpreted as "not hot&q