4.1 Event Type
4.2 Pattern
4.3 Number
4.4 Condition
4.5 Function
4.6 Condition Function
4.7 Counter Function
4.8 Conditional Counter
4.9 More Examples
1 Introduction
The JHAX Event Patterns Website is a web-based program that manipulates and
displays data about world events. This page describes technical aspects of the
program.
Valerie Hudson, working with Philip Schrodt (who also gathered the data
sources), recruited Ray Whitmer to create the programming. This page focuses
on the program, ignoring many significant things such as qualities of the data
and the search for rules that might usefully predict events.
1.1 Selecting a Server
The event pattern system currently runs on
more than one server. Patterns are only available on servers where they have
been saved. You can select another server to save patterns there as well.
Patterns will only work on a server where the appropriate event data set has
been uploaded.
2 Saving Event Pattern Displays
2.1 Retrieving
Enter the name of the event pattern display you wish to retrieve
and click on the Retrieve button or go to the
Patterns List to retrieve a stored event pattern display.
2.2 Saving
Enter the name and password, if any, of the
event pattern display you wish to save and click 'Save'. The name may begin
with an index path separated from the name by "/" to collect patterns into a
separate listing.
2.3 Removing
Enter the name and password, if any, of the
event pattern display you wish to remove and click 'Remove'.
2.4 Password
The password controls whether you
can modify or remove an event pattern display, because it must match the value
specified when the event pattern display was first saved.
2.5 List
The list allows you to see all patterns
that have been saved on the current server. An index path may be given to list
the patterns saved in that particular index path, or it may be left blank to list
the patterns in the default index path.
3 The Page is the Program
The main page of the web site presents
a set of field values that control the program, filled in with working
default values. This page describes how to modify the values to
get exactly the desired output. When the form values have
been modified, pressing any button at the bottom of the form causes the
values to be
incorporated into the URL shown at the top of the browser, so the
address of the page with the settings built-in can be easily bookmarked
or emailed.
3.1 Data Source
The data source field selects which file of data to filter and display.
Currently only one may be selected at a time. Each data source is displayed
by the name of the file as unarchived from
http://www.ukans.edu/~keds/data.html. See that website for information on the
data sets and to see what each contains. When using a data set with the
program, it may be important to know when the data set starts and ends,
which actors are represented in the data, and so on, to produce a
non-empty display.
3.2 Actors' and Recipients' Maps
The actors' map field identifies all actors in the data that will be uniquely
represented on the display, as defined further on in the display description.
The recipients' map field identifies all recipients in the data that will be
uniquely represented on the display, as defined further on in the display
description.
These field values might typically map source actors/recipients to A and B,
for dyadic interactions between two parties.
Event data often has far more actors represented than can be
well-represented separately in the display. Mapping specific actors to
generic names (letters of the alphabet) also permits the same patterns and
display description to be more easily reused with different actors.
The actors' and recipients' map fields may be changed to map
different primary actors into the existing scheme, to add more specific
mappings into the dyadic scheme, or to go beyond the dyadic scheme (for
example using further letters of the alphabet for n-way event displays).
All otherwise-unmapped actors and recipients are mapped to UNK, which can
also be referenced in the patterns or display.
The actors' and recipients' map values are lists of comma-separated mappings.
Each mapping in a list has a source followed by equals and a target name, as
follows:
<source actor>[*] = <target actor>[, ...]
Like most parameters of this program, the source and target actors are
case-sensitive, so the display description must be coordinated using the
proper case.
The source actor or recipient may end in an asterisk to map all actors
or recipients that start with that source.
Where multiple mappings match the same source actor or recipient, the first
matching declaration takes precedence. For example, in the mapping list
FO*:A,BA*:B,BAR*:A, the mapping BAR*:A does not map anything since BA*:B
already included everything starting with BAR, but in the example
FOO*:B,FO*:A,BAR*:A,BA*:B, the mappings FOO*:B and BAR*:A make exceptions to
FO*:A and BA*:B respectively since they occur first.
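The first-match rule can be illustrated with a short Python sketch. This is not the program's own code: the function names are hypothetical, and accepting either "=" (from the syntax line) or ":" (from the examples) as the separator is an assumption made only so that both forms on this page can be tried.

```python
def parse_actor_map(text):
    """Split a map field into an ordered list of (source, target) rules."""
    rules = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: accept "=" (syntax line) or ":" (examples on this page).
        separator = "=" if "=" in item else ":"
        source, target = (part.strip() for part in item.split(separator, 1))
        rules.append((source, target))
    return rules


def map_actor(rules, actor):
    """Return the target of the first matching rule, or UNK if none match."""
    for source, target in rules:
        if source.endswith("*"):
            # A trailing asterisk matches every name starting with the prefix.
            if actor.startswith(source[:-1]):
                return target
        elif actor == source:
            return target
    return "UNK"


shadowed = parse_actor_map("FO*:A,BA*:B,BAR*:A")
exceptions = parse_actor_map("FOO*:B,FO*:A,BAR*:A,BA*:B")
print(map_actor(shadowed, "BARX"))    # B
print(map_actor(exceptions, "BARX"))  # A
```

Because the rules are kept in declaration order and the search stops at the first hit, more specific prefixes only take effect when they are listed before the more general ones.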
3.3 Event Codes
The event codes field maps the numeric codes of source events to a
usually-smaller set of event types to be presented in the display. The
comma-separated values in this list alternate target verbs and source
integer codes in ascending order, starting with the target for negative
infinity and concluding with the target for positive infinity, as follows:
<verb>[, <code>, <verb>[...]]
For example the list previous,10,foo,100,bar,500,unknown maps all values
below 10 to previous, values from 10 to 99 inclusive to foo, values from 100
to 499 inclusive to bar, and values greater than or equal to 500 to unknown.
The values previous, foo, bar, and unknown will be presentable in this case
in the display description.
To add the verb distinct for code 200 alone, the list becomes
previous,10,foo,100,bar,200,distinct,201,bar,500,unknown. Note that the bar
mapping has to be reinstated at 201 or the distinct mapping would have been
in effect until 500.
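The threshold lookup described above can be sketched in Python as follows. The helper names are hypothetical and the use of a binary search is only one convenient way to express the rule; nothing here is taken from the program itself.

```python
from bisect import bisect_right


def parse_event_codes(text):
    """Split verb,code,verb,... into ascending thresholds and their verbs."""
    parts = [part.strip() for part in text.split(",")]
    verbs = parts[0::2]
    thresholds = [int(part) for part in parts[1::2]]
    if len(verbs) != len(thresholds) + 1 or thresholds != sorted(thresholds):
        raise ValueError("expected verb,code,verb,... with ascending codes")
    return thresholds, verbs


def verb_for(code, thresholds, verbs):
    """Each verb applies from its preceding code up to the next code."""
    return verbs[bisect_right(thresholds, code)]


thresholds, verbs = parse_event_codes(
    "previous,10,foo,100,bar,200,distinct,201,bar,500,unknown")
for code in (9, 10, 199, 200, 201, 500):
    print(code, verb_for(code, thresholds, verbs))
```

Looking up 200 gives distinct while 199 and 201 both give bar, which is why the bar mapping has to be repeated at 201.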
3.4 Days per Unit
The days per unit field contains an integer that controls, among other
things, how many days of events will be displayed on each row of the
display. Changing the number of days per unit also affects all patterns
that reference time units explicitly or implicitly. With 2 days per unit, a
sum of 10 units duration sums events from 20 days; with a week per unit this
becomes 10 weeks; and with a day per unit this becomes 10 days.
3.5 Patterns
The patterns text box allows assignment
of names to counter
expressions, one per line. Assigned pattern names may be used in
other patterns or in
the display. Each line has a pattern identifier followed by the
counter expression, as follows:
<pattern identifier> =
<counter expression>
Counter expressions are described in
section 4.
3.5.1 Pattern Comments
The pattern comments text box can be used to
explain what the rules in the pattern represent.
3.6 Display
The display text box specifies how specific events will be represented
on the display. Each line of text consists of four comma-separated values,
as follows:
<column number>, <symbol shape>, <symbol color>, <counter expression>
A description follows of
these fields except for the counter expression, which is described in
section 4.
3.6.1 Column Number
The column number specifies the column
of the graphical output to which
the line outputs symbols (the column's width, symbol size, offsets,
etc. are governed by the previously-described fields). If two
lines in the display description output symbols to the same column, the
symbols come out in the order that the lines appeared in the display
description. Referencing higher column numbers causes the
graphical output to become wider to include the specified column.
Column number 0 is interpreted as the number of an invisible column,
which can be used to temporarily remove something from the graphical
output. Positive column numbers produce output of symbols in the
respective columns starting at the left side. Negative column
numbers output to the same columns as the corresponding positive
numbers, but starting at the right side of the column and proceeding to
the left, possibly colliding with output to the positive column number,
but with reverse offset to avoid collisions as long as possible.
3.6.2 Symbol Shape
The symbol shape specifies the program-supported shape that is to be
drawn to symbolize the particular output. The program supports:
BoxFill: A filled-in box
BoxOutline: An outline box
PointFill: A filled-in arrow point in the column direction
PointLeftFill: A filled-in left arrow point
PointRightFill: A filled-in right arrow point
PointTopFill: A filled-in up arrow point
PointBottomFill: A filled-in down arrow point
PointOutline: An outline arrow point in the column direction
PointLeftOutline: An outline left arrow point
PointRightOutline: An outline right arrow point
PointTopOutline: An outline up arrow point
PointBottomOutline: An outline down arrow point
Angle: An angle bracket in the column direction
AngleLeft: A left angle bracket
AngleRight: A right angle bracket
AngleTop: An angle bracket pointing up
AngleBottom: An angle bracket pointing down
DiamondFill: A filled diamond
DiamondOutline: An outline diamond
X: An X
XTopFill: An X with top filled and bottom connected
XBottomFill: An X with bottom filled and top connected
XTopBottomFill: An X with top and bottom filled
XTopBottomOutline: An X with top and bottom connected
XLeftFill: An X with left filled and right connected
XRightFill: An X with right filled and left connected
XLeftRightFill: An X with left and right filled
XLeftRightOutline: An X with left and right connected
Slash: A slash from bottom left to top right
SlashTopFill: A slash from bottom left to top right, top corner filled
SlashBottomFill: A slash from bottom left to top right, bottom corner filled
SlashTopOutline: A slash from bottom left to top right, top corner connected
SlashBottomOutline: A slash from bottom left to top right, bottom corner connected
Backslash: A slash from top left to bottom right
BackslashTopFill: A slash from top left to bottom right, top corner filled
BackslashBottomFill: A slash from top left to bottom right, bottom corner filled
BackslashTopOutline: A slash from top left to bottom right, top corner connected
BackslashBottomOutline: A slash from top left to bottom right, bottom corner connected
Cross: Crossing horizontal and vertical bars
Horizontal: Horizontal bar
Vertical: Vertical bar
Space: Empty space, displacing next symbol
The points or angles that go in the direction of the column always have
their tips in the direction of progression, meaning
that outputting to a negative column number makes them point left and
outputting to a positive column number makes them point right.
Support for additional symbol shapes will likely be added at some time
in the future.
3.6.3 Symbol Color
The symbol color specifies which color to use to paint the symbol using
an 8-digit hexadecimal (base 16 with digits 0123456789abcdef) number which
specifies, two digits at a time, the value of the alpha, red, green, and
blue components of the color (00, 01, 02, ... fd, fe, ff).
The red, green, and blue components specify how much of the respective
colors are mixed to produce the corresponding color simulation in
humans (dogs and partially-colorblind humans detect two color
components, shrimp detect twelve, and publishers may only detect one or
less, making the system biased towards normal humans).
The alpha component tells how opaque the color is.
For example ff000000 is opaque black (red, green, and blue all 0),
00000000 is transparent black (totally invisible), 80000000 is
half-transparent black (things show through 50%), e00000ff is an eighth
transparent blue (things show through 12.5% with non-zero blue), e0808080
is an eighth transparent gray (things show through 12.5% with red, green,
and blue equal), etc.
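As a rough illustration of how such a color value breaks down, the following Python sketch splits the eight digits into their four components and estimates how much shows through. The function names are hypothetical, and accepting upper-case hex digits is an assumption; the page itself lists only lower-case digits.

```python
import string


def parse_argb(color):
    """Split an 8-digit hex color into (alpha, red, green, blue), each 0-255."""
    if len(color) != 8 or any(ch not in string.hexdigits for ch in color):
        raise ValueError(f"expected 8 hexadecimal digits, got {color!r}")
    value = int(color, 16)
    return ((value >> 24) & 0xFF, (value >> 16) & 0xFF,
            (value >> 8) & 0xFF, value & 0xFF)


def show_through_percent(color):
    """Percentage of the background that shows through the color."""
    alpha = parse_argb(color)[0]
    return round(100 * (255 - alpha) / 255, 1)


print(parse_argb("e00000ff"))            # (224, 0, 0, 255)
print(show_through_percent("80000000"))  # 49.8
print(show_through_percent("e0808080"))  # 12.2
```

The percentages quoted in the text (50% and 12.5%) are the rounded-off versions of these values, since alpha runs from 0 to 255 rather than 0 to 256.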
3.6.4 Counter Expression
The counter expression describes the counter which specifies how many
of the symbols to draw on a particular row of the display. See
section 4 on counter expressions.
3.7 Starting and Ending Date
Starting or ending date fields may be used to limit (or even artificially
extend) the display. This is especially useful to spare the browser (and
the web server) the load of redisplaying the entire data set
graphically. A browser which is in the process of downloading too
much data might also be stopped by pressing the stop button or
escape key (or closing and restarting the browser). If left unspecified,
the starting or ending date of the data source will be used.
A properly-specified date for this program consists of 8 digits which
are run together with no separating spaces or punctuation. The
first four digits are the year number, the next two the month, and the
last two are the day, as follows:
<year digit><year digit><year digit><year digit><month digit><month digit><day digit><day digit>
Anything else (except leaving the field
blank to specify no date) should cause an error.
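A minimal Python sketch of this date rule follows. The function name is hypothetical, and treating a blank field as "no date" by returning None is just one way to model the behavior described above; the program's actual validation may differ.

```python
from datetime import datetime


def parse_date_field(text):
    """Return a date for an 8-digit YYYYMMDD field, or None if left blank."""
    text = text.strip()
    if text == "":
        return None  # blank means: use the data source's own start or end
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected 8 digits (YYYYMMDD), got {text!r}")
    # strptime also rejects impossible dates such as month 13 or day 32.
    return datetime.strptime(text, "%Y%m%d").date()


print(parse_date_field("19910117"))  # 1991-01-17
print(parse_date_field(""))          # None
```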
3.8 Symbol Size
The symbol size field tells the program how many dots high and wide a
symbol should be in the graphical output. If specified as a single number,
the same value is used for width and height, but two comma-separated values
may be given to independently specify width and height for less-symmetrical
sizes, as follows:
<symbol width>, <symbol height>
For example, 6,12 specifies symbols that are twice as high as they are
wide.
Changing the symbol size is one easy way to change the width and length
of the resulting page since symbol size directly contributes to
page size.
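The one-or-two-value convention can be sketched in Python like this. The function name is hypothetical and the error handling is an assumption; the page does not say what the program does with three or more values.

```python
def parse_size_field(text):
    """Return (width, height) from a field holding one or two integers."""
    parts = [int(part) for part in text.split(",")]
    if len(parts) == 1:
        return parts[0], parts[0]  # a single number applies to both
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError(f"expected one or two numbers, got {text!r}")


print(parse_size_field("8"))     # (8, 8)
print(parse_size_field("6,12"))  # (6, 12)
```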
3.9 Pixel Offset
The pixel offset field tells the program how many dots to shift when
offsetting symbols to avoid total occlusion of different symbols in the
graphical display. This affects the total width of the display,
because when space is reserved for symbols in the display, each
possible offset reserves this much space horizontally and
vertically. If specified as a single number, the same value is
used horizontally and vertically, but two comma-separated values may be
given to independently specify the horizontal and vertical pixel
offsets, as follows:
<horizontal pixel offset>, <vertical pixel offset>
These values may also be specified as negative numbers to cause the
offsetting to occur on the opposite side of the symbol. For example, the
value 0,-3 eliminates horizontal offsetting and makes the vertical
offsetting occur by three pixels at each offset in the opposite direction
from default offsetting. The offsetting is also reversed when dealing
with left-justified versus right-justified columns in order to make
total occlusion of symbols from opposite sides of the column less
likely.
3.10 Dimensions
The dimensions field contains three other comma-separated values that
are used in the layout of the graphical output, as follows:
<column width>, <height>, <offsets per row>
Column width is the number of symbols that may be placed in each
column. As with symbol size and pixel offset, this directly
affects the total width of the graphical output.
Height is the number of rows of symbols that may be placed in a single
graphic before exhausting the space. This affects the total height
of each graphic and inversely affects the total number of graphics that
will be required to present the requested output.
Offsets per row is the number of offsets that can be distinguished in a
particular row before total occlusion occurs. Together with pixel
offset, this controls how much extra space is reserved for each symbol
to allow offsetting, also affecting the width and height of the
graphical output.
3.11 Image Type
The image type field specifies which browser-supported image format to
use, JPEG or PNG. The best choice depends on what your browser supports.
While JPEG is slightly older and may be supported in more browsers, PNG is
more suitable to the task because JPEG is designed to compress pictures
containing continuously-varying shades and loses detail in the process,
which means that if you choose JPEG, the graphics will take longer to
download and the symbols will be fuzzier in color and position.
GIF is an older format that deals better with non-continuously-shaded
graphics such as the output of this program, but after the format
became popular, Unisys made it legally risky to use the format in a
program, which is why the World Wide Web community produced PNG (Portable
Network Graphics), an unencumbered and more-advanced replacement that is
supported by later versions of standards-compliant browsers.
3.12 None, Graphic, and Text Buttons
Pressing one of these buttons makes a
request of the server to incorporate the current form fields into the
URL (so they can be bookmarked), adding the specified display of events
to the page.
If none is pressed, any graphic or text display of events is removed,
which may be useful for bookmarking the undisplayed form.
If graphic is pressed, the graphical output of the program is displayed.
If text is pressed, a textual version of the output of the program is
displayed.
4 Counter Expressions
A counter expression computes some
number usually based on event counts. This
can be an event type, a pattern name, a number, a function,
an automatically-converted condition, or a conditional counter.
4.1 Event Type
When used as a counter expression, the event type refers to the number of
matching events in the current unit of time.
An event type is the actor, followed by dash, followed by the recipient,
followed by dash, followed by the verb, as follows:
<source actor>-<target actor>-<verb>
The event type is always interpreted after mapping the
actors and codes, so the source and target actors must each be
targets in the actor map field and the verb must be from the event
codes field. If, for example, the actor map produces source and target
actors A and B and the code maps to the target verb MaterialCooperation,
then the corresponding event type to refer to the count of such events is
A-B-MaterialCooperation.
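To make the counting concrete, here is a small Python sketch that tallies already-mapped events in one unit of time by event type. The function name and the sample events are hypothetical; only the actor-recipient-verb naming comes from this page.

```python
from collections import Counter


def count_event_types(events):
    """Tally mapped (actor, recipient, verb) events by their event type name."""
    return Counter("-".join(event) for event in events)


# Hypothetical events falling in the current unit of time, after mapping.
unit_events = [
    ("A", "B", "MaterialCooperation"),
    ("A", "B", "MaterialCooperation"),
    ("B", "A", "VerbalCooperation"),
]
counts = count_event_types(unit_events)
print(counts["A-B-MaterialCooperation"])  # 2
print(counts["A-B-MaterialConflict"])     # 0
```

An event type with no matching events in the unit simply counts as 0.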
4.2 Pattern
A pattern is a name (with no dashes)
that refers to the counter expression of a previously-declared pattern.
4.3 Number
A number is an integer constant that is
returned instead of a
counter. This is most-useful as part of a more-complex counter
expression, since it is boring to display the same number of symbols on
every row of the display. Asking for the sum of a constant over a
time range will return the constant once for every time unit, because
it behaves like other expressions during ranges, but as a constant
value rather than as the result of counting something.
4.4 Condition
A condition is an expression which can be evaluated to be either true or
false. It can be either a simple condition or a function condition.
A simple condition is any counter expression, followed by a comparison
operator (one of <, >, <=, >=, =, !=), followed by another counter
expression. The comparison operator compares the counts of the two counter
expressions and returns true or false. It is possible to compare conditions
with each other for an exclusive-or-like effect; conditions are converted to
counters before comparison, so compare to 1 for true or 0 for false.
If a condition is directly used by a display, pattern, function, or other
place where a counter is expected, it becomes a counter condition, which
returns 1 or 0 for true or false. If a counter is used where a condition is
required, it is an error. When a condition has been stored as a pattern, and
so converted to a counter, comparing that pattern to 0 converts it back into
a condition:
patA>0
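The round trip between conditions and counters can be sketched in Python as follows. The names are hypothetical; the only things taken from this page are the six comparison symbols and the 1-or-0 conversion.

```python
import operator

# The comparison symbols listed above, mapped to Python's operators.
COMPARISONS = {
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
    "=": operator.eq, "!=": operator.ne,
}


def simple_condition(left, comparison, right):
    """Compare two counter values and return True or False."""
    return COMPARISONS[comparison](left, right)


def as_counter(condition):
    """A condition used where a counter is expected becomes 1 or 0."""
    return 1 if condition else 0


pat_a = as_counter(simple_condition(12, ">", 10))  # stored as a pattern: 1
restored = simple_condition(pat_a, ">", 0)         # patA>0 is a condition again
print(pat_a, restored)  # 1 True
```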
4.5 Function
A function is a well-recognized name, followed by an opening parenthesis,
followed by one or more comma-separated expressions, followed by a closing
parenthesis.
sum(patA,patB)
A function may optionally be followed by brackets containing one or two
colon-separated numbers which indicate the offset (the first number, which
may not be negative) and depth (the second number, which must be positive) of
the desired value fields. The default offset is 0, indicating the currently
computed unit, and the default depth is 1, meaning look at the counter only
in the current unit, not going back any additional units.
sum(patA,patB)[5:20]
A function (with or without bracketed numbers) may optionally be followed by
a slash and a number, which causes the resulting value to be divided by the
specified number, with any remainder discarded.
sum(patA,patB)[0:4]/4
A negative number causes the sign of the function result to be changed, which
is useful for subtracting using the sum function.
sum(patA,patB,sum(patC,patD)/-2)[0:4]
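The offset, depth, and divisor can be modeled in Python with counters represented as functions of the unit index. This is a sketch under stated assumptions, not the program's implementation: all names are hypothetical, units outside the data are assumed to count as 0, and "remainder discarded" is assumed to mean truncation toward zero.

```python
def series_counter(series):
    """Wrap per-unit counts; units outside the data count as 0 (assumption)."""
    def at(t):
        return series[t] if 0 <= t < len(series) else 0
    return at


def truncating_divide(value, divisor):
    """Integer division that discards the remainder (truncates toward zero)."""
    quotient = abs(value) // abs(divisor)
    return quotient if (value < 0) == (divisor < 0) else -quotient


def counter_function(reduce, *args, offset=0, depth=1, divisor=1):
    """Model name(args)[offset:depth]/divisor as a counter evaluated at unit t."""
    if offset < 0 or depth < 1:
        raise ValueError("offset may not be negative and depth must be positive")

    def at(t):
        newest = t - offset  # most recent unit included in the window
        values = [arg(u)
                  for u in range(newest - depth + 1, newest + 1)
                  for arg in args]
        return truncating_divide(reduce(values), divisor)
    return at


pat_a = series_counter([1, 2, 3, 4, 5, 6])
pat_b = series_counter([0, 1, 0, 1, 0, 1])
now = 5  # index of the currently computed unit

print(counter_function(sum, pat_a, pat_b)(now))                      # 7
print(counter_function(sum, pat_a, pat_b, depth=4, divisor=4)(now))  # 5
print(counter_function(sum, pat_a, pat_b, offset=5, depth=20)(now))  # 1
```

Because the result of counter_function is itself a counter, calls can be nested in the same way the expressions above nest, for example passing a sum with divisor=-2 as an argument of an outer sum to subtract.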
4.6 Condition Function
Condition functions operate on condition expressions and produce conditions.
The condition functions are all, any, and not, which compute the logical
and, logical or, and opposite of conditions.
all(patA>=300,patA<=600,any(patA>patB,patA<1000),not(patB=1))
As with any condition, where a condition function is used as a counter
expression, it becomes 1 or 0. It is an error to try to make a condition
function operate directly on a counter expression such as a pattern, but a
pattern can be easily incorporated via a simple condition.
any(patA>0,patB>0)
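A loose Python analogue of the three condition functions is shown below. The cond_ names are hypothetical, and raising TypeError is only one way to model the "it is an error" rule for counters passed where conditions are required.

```python
def _require_conditions(values):
    """Reject counters: condition functions only accept true/false values."""
    for value in values:
        if not isinstance(value, bool):
            raise TypeError("condition functions take conditions, not counters")


def cond_all(*conditions):
    _require_conditions(conditions)
    return all(conditions)


def cond_any(*conditions):
    _require_conditions(conditions)
    return any(conditions)


def cond_not(condition):
    _require_conditions([condition])
    return not condition


pat_a, pat_b = 450, 2  # hypothetical pattern values in the current unit
result = cond_all(pat_a >= 300, pat_a <= 600,
                  cond_any(pat_a > pat_b, pat_a < 1000),
                  cond_not(pat_b == 1))
print(result, int(result))  # True 1
```

Passing pat_a directly, as in cond_any(pat_a, pat_b), raises the error, while cond_any(pat_a > 0, pat_b > 0) is accepted.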
4.7 Counter Function
Counter functions operate on counter expressions and produce counter
values. The counter functions are sum, min, and max, which compute the sum,
minimum, and maximum of a set of counters.
min(sum(patA)[0:4],max(patB)[0:4])
4.8 Conditional Counter
Where conditions are used as a counter expression, they are automatically
converted to 1 or 0 if they are true or false. It is possible to cause true
to be converted to other values by following the condition with ? and then
a number or other counter expression.
patA>3?patA
Other values for false are also possible by following the true value by :
which is followed by the false value.
patA>patB?9:patC
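The two forms above behave much like a conditional expression with defaults of 1 and 0. A minimal Python sketch, with a hypothetical function name and made-up pattern values:

```python
def conditional_counter(condition, true_value=1, false_value=0):
    """Model condition?true_value:false_value with the defaults 1 and 0."""
    return true_value if condition else false_value


pat_a, pat_b, pat_c = 5, 7, 2  # hypothetical pattern values

print(conditional_counter(pat_a > 3))                # 1  (plain patA>3)
print(conditional_counter(pat_a > 3, pat_a))         # 5  (patA>3?patA)
print(conditional_counter(pat_a > pat_b, 9, pat_c))  # 2  (patA>patB?9:patC)
```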
4.9 More Examples
sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[5]
The sum of material and verbal cooperations from A to B five units of time
before the current time.
sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[0:10]
The sum of material and verbal cooperations from A to B during the ten
units of time ending at the current time.
sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[5:10]/3
The sum of material and verbal cooperations from A to B during the ten
units of time ending five units of time before the current time,
divided by 3.
min(A-B-MaterialCooperation,A-B-MaterialConflict)
The lesser of the material cooperation or the material conflict from A to B
at the current time.
min(A-B-MaterialCooperation,A-B-MaterialConflict)[0:50]
The least count of material cooperation or material conflict from A to B
in the 50 days ending at the current time.
min(sum(A-B-MaterialCooperation,A-B-MaterialConflict))[0:50]
The least of the sums of material cooperation and material conflict in each
day from A to B in the 50 days ending at the current time.
min(sum(A-B-MaterialCooperation,A-B-MaterialConflict)[3:5])[0:50]
The least of the 3-day-old 5-day sums of material cooperation and material
conflict from A to B in the 50 days ending at the current time.
min(sum(A-B-MaterialCooperation,A-B-MaterialConflict)[0:5])[3:50]
The same interpretation as the previous example.
min(max(sum(A-B-MaterialCooperation)[0:1])[0:1])[0:1]
Since minimums, maximums, and sums of a single value are that value, and an
age of 0 and duration of 1 is just the default interpretation in the current
time, this expression should be simplified to the unadorned event type with
no additional expression, i.e. A-B-MaterialCooperation.
min(A-B-MaterialCooperations)[5]/2
Which operator (min, max, or sum) was specified is irrelevant since it
applies to a single event type and time unit, but some operator was required
so that dividing and aging could be specified.
Note that the division makes little sense in this sort of case because
the counter is at most 1.
min(A-B-MaterialCooperations)[0:5]/2
The operator is now relevant, because minimization is applied over a period
of 5 units of time, the result of which is divided by 2.
min(1,sum(A-B-MaterialCooperations,B-A-MaterialCooperations)[0:5]/10)
Count the number of material cooperations between A and B in either
direction over the latest 5 days, scaling by a tenth so that a symbol is
displayed for 10 or more, but never displaying more than 1 symbol.
all(sum(B-A-MaterialConflict)[5:5]>4,sum(A-B-MaterialConflict)[0:5]>10)
Return 1 symbol whenever more than 4 material conflicts from B to A spread
over at most 5 units of time are followed by more than 10 material conflicts
from A to B spread over at most 5 units of time.
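The claim above that [3:5] inside with [0:50] outside reads the same as [0:5] inside with [3:50] outside can be checked numerically with a small Python model. This is only a sketch: the helper name, the made-up data, and the assumption that units outside the data count as 0 are not taken from the program.

```python
def windowed(reduce, counter, offset, depth):
    """Apply reduce to counter over `depth` units ending `offset` units back."""
    def at(t):
        newest = t - offset
        return reduce(counter(u) for u in range(newest - depth + 1, newest + 1))
    return at


# Hypothetical per-unit totals of A-B-MaterialCooperation plus
# A-B-MaterialConflict; any sequence of counts would do.
totals = [(7 * i * i + 3 * i) % 11 for i in range(200)]


def daily(t):
    """Total for unit t; units outside the data count as 0 (assumption)."""
    return totals[t] if 0 <= t < len(totals) else 0


inner_aged = windowed(min, windowed(sum, daily, 3, 5), 0, 50)  # [3:5] then [0:50]
outer_aged = windowed(min, windowed(sum, daily, 0, 5), 3, 50)  # [0:5] then [3:50]
print(all(inner_aged(t) == outer_aged(t) for t in range(len(totals))))  # True
```

In both forms the 5-unit sums being minimized end between 52 and 3 units before the current time, which is why the two expressions agree.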