JHAX Event Patterns Guide

Created by Ray Whitmer
Revised on $Date: 2005-11-21 14:19:34 $ GMT

0 Contents

1 Introduction
1.1 Selecting a Server
2 Saving Event Pattern Displays
2.1 Retrieving
2.2 Saving
2.3 Removing
2.4 Password
2.5 List
3 The Page is the Program
3.1 Data Source
3.2 Actors' and Recipients' Maps
3.3 Event Codes
3.4 Days per Unit
3.5 Patterns
3.5.1 Pattern Comments
3.6 Display
3.6.1 Column Number
3.6.2 Symbol Shape
3.6.3 Symbol Color
3.6.4 Counter Expression
3.7 Starting and Ending Date
3.8 Symbol Size
3.9 Pixel Offset
3.10 Dimensions
3.11 Image Type
3.12 None, Graphic, and Text Buttons
4 Counter Expressions
4.1 Event Type
4.2 Pattern
4.3 Number
4.4 Condition
4.5 Function
4.6 Condition Function
4.7 Counter Function
4.8 Conditional Counter
4.9 More Expression Examples

1 Introduction

The JHAX Event Patterns Website is a web-based program that manipulates and displays data about world events.  This page describes technical aspects of the program.

Valerie Hudson, working with Philip Schrodt (who also gathered the data sources), recruited Ray Whitmer to create the program.  This page focuses on the program and ignores many significant matters, such as the qualities of the data and the search for rules that might usefully predict events.

1.1 Selecting a Server

The event pattern system currently runs on more than one server. Patterns are only available on servers where they have been saved. You can select another server to save patterns there as well. Patterns will only work on a server where the appropriate event data set has been uploaded.

2 Saving Event Pattern Displays

2.1 Retrieving

Enter the name of the event pattern display you wish to retrieve and click on the Retrieve button or go to the Patterns List to retrieve a stored event pattern display.

2.2 Saving

Enter the name and password, if any, of the event pattern display you wish to save and click 'Save'. The name may begin with an index path separated from the name by "/" to collect patterns into a separate listing.

2.3 Removing

Enter the name and password, if any, of the event pattern display you wish to remove and click 'Remove'.

2.4 Password

The password controls whether you can modify or remove an event pattern display: it must match the value specified when the event pattern display was first saved.

2.5 List

The list allows you to see all patterns that have been saved on the current server. An index path may be given to list the patterns saved in that particular index path, or it may be left blank to list the patterns in the default index path.

3 The Page is the Program

The main page of the web site presents a set of field values that control the program, filled in with working default values.  This page describes how to modify the values to get exactly the desired output.  When the form values have been modified, pressing any button at the bottom of the form causes the values to be incorporated into the URL shown at the top of the browser, so the address of the page with the settings built-in can be easily bookmarked or emailed.

3.1 Data Source

The data source field selects which file of data to filter and display.  Currently only one may be selected at a time.  Each data source is displayed by the name of the file as unarchived from http://www.ukans.edu/~keds/data.html.  See that website for information on the data sets and what each contains.  When using a data set with the program, it may be important to know when the data set starts and ends, which actors are represented in the data, and so on, in order to produce a non-empty display.

3.2 Actors' and Recipients' Maps

The actors' map field identifies all actors in the data that will be uniquely represented on the display, as defined further on in the display description.  The recipients' map field identifies all recipients in the data that will be uniquely represented on the display, as defined further on in the display description.  These field values might typically map source actors/recipients to A and B, for dyadic interactions between two parties. Event data often has far more actors than can be represented separately in the display.  Mapping specific actors to generic names (letters of the alphabet) also permits the same patterns and display description to be more easily reused with different actors.

The actors' and recipients' map fields may be changed to map different primary actors into the existing scheme, to add more specific mappings into the dyadic scheme, or to go beyond the dyadic scheme (for example using further letters of the alphabet for n-way event displays).  All otherwise-unmapped actors and recipients are mapped to UNK, which can also be referenced in the patterns or display.

The actors' and recipients' map values are lists of comma-separated mappings.  Each mapping in a list has a source followed by equals and a target name, as follows:

<source actor>[*] = <target actor>[, ...]

Like most parameters of this program, the source and target actors are case-sensitive, so the display description must be coordinated using the proper case.

The source actor or recipient may end in an asterisk to map all actors or recipients that start with that source.  Where multiple mappings match the same source actor or recipient, the first matching declaration takes precedence.  For example, in the mapping list FO*:A,BA*:B,BAR*:A, the mapping BAR*:A does not map anything since BA*:B already includes everything starting with BAR, but in the example FOO*:B,FO*:A,BAR*:A,BA*:B, the mappings FOO*:B and BAR*:A make exceptions to FO*:A and BA*:B respectively since they occur first.
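
The first-match rule can be illustrated with a short Python sketch.  This is not the program's own code: the names parse_map and map_actor are hypothetical, and the parser accepts either = or : between source and target only because both forms appear in this guide.

```python
def parse_map(text):
    """Parse a mapping list such as 'FOO*:B,FO*:A' into (source, target) pairs."""
    pairs = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        # Assumption: either '=' or ':' may separate source from target.
        positions = [p for p in (item.find("="), item.find(":")) if p >= 0]
        if not positions:
            raise ValueError("no separator in mapping: " + item)
        cut = min(positions)
        pairs.append((item[:cut].strip(), item[cut + 1:].strip()))
    return pairs


def map_actor(name, pairs):
    """Return the target of the first matching mapping, or UNK if none match."""
    for source, target in pairs:
        if source.endswith("*"):
            # A trailing asterisk matches every name starting with the prefix.
            if name.startswith(source[:-1]):
                return target
        elif name == source:
            return target
    return "UNK"


pairs = parse_map("FOO*:B,FO*:A,BAR*:A,BA*:B")
print(map_actor("FOOD", pairs))  # B, because FOO* is listed before FO*
print(map_actor("FORT", pairs))  # A
print(map_actor("XYZ", pairs))   # UNK
```

Because the list is scanned in order, placing the more specific prefix first is what creates the exception.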

3.3 Event Codes

The event codes field maps the numeric codes of source events to a usually smaller set of event types to be presented in the display.  The comma-separated values in this list alternate between target verbs and source integer codes in ascending order, starting with the target for negative infinity and concluding with the target for positive infinity, as follows:

<verb>[, <code>, <verb>[...]]

For example the list previous,10,foo,100,bar,500,unknown maps all values below 10 to previous, values from 10 to 99 inclusive to foo, values from 100 to 499 inclusive to bar, and values greater than or equal to 500 to unknown.  The values previous, foo, bar, and unknown will be presentable in this case in the display description.

To add a distinct verb for code 200, the list becomes previous,10,foo,100,bar,200,distinct,201,bar,500,unknown.  Note that the bar mapping has to be reinstated at 201 or the distinct mapping would have been in effect until 500.
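
The range lookup described above can be sketched in Python with a binary search.  The function names parse_event_codes and map_code are hypothetical and only illustrate the rule; the program itself may work differently.

```python
from bisect import bisect_right


def parse_event_codes(text):
    """Split 'verb,code,verb,...' into a verb list and an ascending code list."""
    items = [s.strip() for s in text.split(",")]
    verbs = items[0::2]
    codes = [int(s) for s in items[1::2]]
    if len(verbs) != len(codes) + 1:
        raise ValueError("list must start and end with a verb")
    if codes != sorted(codes):
        raise ValueError("codes must be in ascending order")
    return verbs, codes


def map_code(code, verbs, codes):
    """Return the verb whose range contains the given source code."""
    # bisect_right counts how many boundaries are <= code, which is
    # exactly the index of the verb in effect at that code.
    return verbs[bisect_right(codes, code)]


verbs, codes = parse_event_codes(
    "previous,10,foo,100,bar,200,distinct,201,bar,500,unknown")
print(map_code(9, verbs, codes))    # previous
print(map_code(200, verbs, codes))  # distinct
print(map_code(201, verbs, codes))  # bar
print(map_code(500, verbs, codes))  # unknown
```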

3.4 Days per Unit

The days per unit field contains an integer that controls, among other things, how many days of events will be displayed on each row of the display.  Changing the number of days per unit also affects patterns, all of which reference time units explicitly or implicitly.  With 2 days per unit, a sum of 10 units duration will sum events from 20 days; with a week per unit this becomes 10 weeks; and with a day per unit this becomes 10 days.

3.5 Patterns

The patterns text box allows assignment of names to counter expressions, one per line.  Assigned pattern names may be used in other patterns or in the display.  Each line has a pattern identifier followed by the counter expression, as follows:

<pattern identifier> = <counter expression>

Counter expressions are described in section 4.
3.5.1 Pattern Comments
The pattern comments text box can be used to explain what the rules in the pattern represent.

3.6 Display

The display text box specifies how specific events will be represented on the display.  Each line of text consists of four comma-separated values, as follows:

<column number>, <symbol shape>, <symbol color>, <counter expression>

A description follows of these fields except for the counter expression, which is described in section 4.
3.6.1 Column Number
The column number specifies the column of the graphical output to which the line outputs symbols (the column's width, symbol size, offsets, etc. are governed by other fields described in this section).  If two lines in the display description output symbols to the same column, the symbols come out in the order that the lines appeared in the display description.  Referencing higher column numbers causes the graphical output to become wider to include the specified column.

0 is interpreted as the number of an invisible column, which can be used to temporarily remove something from the graphical output.  Positive column numbers produce output of symbols in the respective columns starting at the left side.  Negative column numbers output to the same columns as the corresponding positive numbers, but starting at the right side of the column and proceeding to the left, possibly colliding with output to the positive column number, but with reverse offset to avoid collisions as long as possible.
3.6.2 Symbol Shape
The symbol shape specifies the program-supported shape that is to be drawn to symbolize the particular output.  The program supports
3.6.3 Symbol Color
The symbol color specifies which color to use to paint the symbol, using an 8-digit hexadecimal number (base 16, with digits 0123456789abcdef) that specifies, two digits at a time, the values of the alpha, red, green, and blue components of the color (00, 01, 02, ... fd, fe, ff).

The red, green, and blue components specify how much of the respective colors are mixed to produce the corresponding color simulation in humans (dogs and partially-colorblind humans detect two color components, shrimp detect twelve, and publishers may only detect one or less, making the system biased towards normal humans).

The alpha component tells how opaque the color is.

For example, ff000000 is opaque black (red, green, and blue all 0), 00000000 is transparent black (totally invisible), 80000000 is half-transparent black (things show through 50%), e00000ff is an eighth-transparent blue (things show through 12.5% with non-zero blue), e0808080 is an eighth-transparent gray (things show through 12.5% with red, green, and blue equal), etc.
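
As an illustration only, the following Python sketch splits such a color value into its four components.  The names parse_color and transparency are hypothetical, and accepting upper-case hexadecimal digits is an assumption.

```python
import string


def parse_color(text):
    """Split an 8-digit hexadecimal color into (alpha, red, green, blue)."""
    if len(text) != 8 or any(ch not in string.hexdigits for ch in text):
        raise ValueError("expected 8 hexadecimal digits: " + text)
    # Each component is two hexadecimal digits, so it ranges from 0 to 255.
    return tuple(int(text[i:i + 2], 16) for i in range(0, 8, 2))


def transparency(text):
    """Return the fraction of the background that shows through."""
    alpha = parse_color(text)[0]
    return 1 - alpha / 255


print(parse_color("e00000ff"))             # (224, 0, 0, 255)
print(round(transparency("80000000"), 2))  # 0.5
```

The quoted percentages are approximations: an alpha of e0 is 224 out of 255, which lets slightly less than an eighth show through.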
3.6.4 Counter Expression
The counter expression describes the counter which specifies how many of the symbols to draw on a particular row of the display.  See the section on counter expressions.

3.7 Starting and Ending Date

Starting or ending date fields may be used to limit (or even artificially extend) the display.  This is especially useful to spare the browser (and the web server) the load of redisplaying the entire data set graphically.  A browser which is in the process of downloading too much data might also be stopped by pressing the stop button or escape key (or closing and restarting the browser).  If left unspecified, the starting or ending date of the data source will be used.

A properly specified date for this program consists of 8 digits run together with no separating spaces or punctuation.  The first four digits are the year number, the next two the month, and the last two the day, as follows:

<year digit><year digit><year digit><year digit><month digit><month digit><day digit><day digit>

Anything else (except leaving the field blank to specify no date) should cause an error.
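
A minimal Python sketch of this date rule follows.  The name parse_date is hypothetical, and rejecting impossible calendar dates (such as month 13) is an assumption about what "an error" covers.

```python
from datetime import date


def parse_date(text):
    """Parse an 8-digit YYYYMMDD date; a blank field means no date."""
    if text == "":
        return None
    if len(text) != 8 or not (text.isascii() and text.isdigit()):
        raise ValueError("expected 8 digits, got: " + text)
    # date() itself raises ValueError for an impossible month or day.
    return date(int(text[:4]), int(text[4:6]), int(text[6:]))


print(parse_date("20051121"))  # 2005-11-21
```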

3.8 Symbol Size

The symbol size field tells the program how many dots wide and high each symbol should be in the graphical output.  If specified as a single number, the same value is used for width and height, but two comma-separated values may be given to specify width and height independently for less-symmetrical sizes, as follows:

<symbol width>, <symbol height>

For example, 6,12 specifies symbols that are twice as high as they are wide.
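
The one-or-two-value convention can be sketched in Python as follows.  The name parse_pair is hypothetical; the pixel offset field in section 3.9 follows the same convention, including negative values.

```python
def parse_pair(text):
    """Parse 'n' or 'a,b' into a pair of integers."""
    parts = [int(p) for p in text.split(",")]
    if len(parts) == 1:
        # A single number applies to both dimensions.
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("expected one or two values: " + text)


print(parse_pair("6,12"))  # (6, 12)
print(parse_pair("8"))     # (8, 8)
```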

Changing the symbol size is one easy way to change the width and length of the resulting page, since symbol size directly contributes to page size.

3.9 Pixel Offset

The pixel offset field tells the program how many dots to shift when offsetting symbols to avoid total occlusion of different symbols in the graphical display.  This affects the total width of the display, because when space is reserved for symbols in the display, each possible offset reserves this much space horizontally and vertically.  If specified as a single number, the same value is used horizontally and vertically, but two comma-separated values may be given to independently specify the horizontal and vertical pixel offsets, as follows:

<horizontal pixel offset>, <vertical pixel offset>

These values may also be specified as negative numbers to cause the offsetting to occur on the opposite side of the symbol, i.e. the value 0,-3 eliminates horizontal offsetting and makes the vertical offsetting occur by three pixels at each offset in the opposite direction from default offsetting.  The offsetting is also reversed when dealing with left-justified versus right-justified columns in order to make total occlusion of symbols from opposite sides of the column less likely.

3.10 Dimensions

The dimensions field contains three comma-separated values that are used in the layout of the graphical output, as follows:

<column width>, <height>, <offsets per row>

Column width is the number of symbols that may be placed in each column.  As with symbol size and pixel offset, this directly affects the total width of the graphical output.

Height is the number of rows of symbols that may be placed in a single graphic before exhausting the space.  This affects the total height of each graphic and inversely affects the total number of graphics that will be required to present the requested output.

Offsets per row is the number of offsets that can be distinguished in a particular row before total occlusion occurs.  Together with pixel offset, this controls how much extra space is reserved for each symbol to allow offsetting, also affecting the width and height of the graphical output.

3.11 Image Type

The image type field specifies which browser-supported image format to use, JPEG or PNG.  The best choice depends on what your browser supports.  While JPEG is slightly older and may be supported in more browsers, PNG is more suitable to the task because JPEG is designed to compress pictures containing continuously-varying shades and loses information in the process, which means that if you choose JPEG, the graphics will take longer to download and the symbols will be fuzzier in color and position.

GIF is an older format that deals better with non-continuously-shaded graphics such as the output of this program, but after the format became popular, Unisys made it legally risky to use the format in a program, which is why the World Wide Web community produced PNG (Portable Network Graphics), an unencumbered and more advanced replacement that is supported by later versions of standards-compliant browsers.

3.12 None, Graphic, and Text Buttons

Pressing one of these buttons makes a request of the server to incorporate the current form fields into the URL (so they can be bookmarked), adding the specified display of events to the page.

If none is pressed, any graphic or text display of events is removed, which may be useful for bookmarking the undisplayed form.

If graphic is pressed, the graphical output of the program is displayed.

If text is pressed, a textual version of the output of the program is displayed.

4 Counter Expressions

A counter expression computes some number usually based on event counts.  This can be an event type, a pattern name, a number, a function, an automatically-converted condition, or a conditional counter.

4.1 Event Type

When used as a counter expression, the event type refers to the number of matching events in the current unit of time.

An event type is the actor, followed by dash, followed by the recipient, followed by dash, followed by the verb, as follows:

<source actor>-<target actor>-<verb>

The event type is always interpreted after mapping the actors and codes, so the source and target actors must each be targets in the actors' and recipients' map fields and the verb must be from the event codes field.  If, for example, the maps produce source and target actors A and B and the code maps to the target verb MaterialCooperation, then the corresponding event type to refer to the count of such events is A-B-MaterialCooperation.
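
As a rough Python sketch of how an event type acts as a counter, the following splits the name on dashes and counts matching events in one unit of time.  The names parse_event_type and count_events, and the representation of events as (actor, recipient, verb) tuples, are assumptions made for illustration.

```python
def parse_event_type(text):
    """Split 'A-B-Verb' into (actor, recipient, verb)."""
    parts = text.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected <actor>-<recipient>-<verb>: " + text)
    return tuple(parts)


def count_events(unit_events, event_type):
    """Count the already-mapped events in one time unit that match the type."""
    wanted = parse_event_type(event_type)
    return sum(1 for event in unit_events if event == wanted)


# Hypothetical events for a single unit, after actor and code mapping.
unit = [("A", "B", "MaterialCooperation"),
        ("B", "A", "MaterialCooperation"),
        ("A", "B", "MaterialCooperation"),
        ("A", "UNK", "VerbalCooperation")]
print(count_events(unit, "A-B-MaterialCooperation"))  # 2
```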

4.2 Pattern

A pattern is a name (with no dashes) that refers to the counter expression of a previously-declared pattern.

4.3 Number

A number is an integer constant that is returned instead of a counter.  This is most-useful as part of a more-complex counter expression, since it is boring to display the same number of symbols on every row of the display.  Asking for the sum of a constant over a time range will return the constant once for every time unit, because it behaves like other expressions during ranges, but as a constant value rather than as the result of counting something.

4.4 Condition

A condition is an expression which can be evaluated to be either true or false. It can be either a simple condition or a function condition.

A simple condition is any counter expression, followed by a comparator (one of <, >, <=, >=, =, !=), followed by another counter expression. The comparator compares the counts of the two counter expressions and returns true or false.

If a condition is used directly by a display, pattern, function, or other place where a counter is expected, it becomes a counter condition, which returns 1 or 0 for true or false. Conditions may themselves be compared for an exclusive-or-like effect; because they are converted to counters before the comparison, compare to 1 for true or 0 for false.

If a counter is used where a condition is required, it is an error. When a pattern holds a condition that was converted to a counter, comparing the pattern to 0 converts it back into a condition:

patA>0

4.5 Function

A function is a well-recognized name, followed by an opening parenthesis, followed by one or more comma-separated expressions, followed by a closing parenthesis.

sum(patA,patB)

A function may optionally be followed by brackets containing one or two colon-separated numbers, which indicate the offset (the first number, which may not be negative) and depth (the second number, which must be positive) of the desired value fields. The default offset is 0, indicating the currently computed unit, and the default depth is 1, meaning look at the counter only in the current unit, not going back any additional units.

sum(patA,patB)[5:20]

A function (with or without bracketed numbers), may optionally be followed by a slash and a number, which causes the resulting value to be divided by the specified number, with any remainder discarded.

sum(patA,patB)[0:4]/4

A negative number causes the sign of the function result to be changed, which is useful for subtracting using the sum function.

sum(patA,patB,sum(patC,patD)/-2)[0:4]
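
The offset, depth, and division rules can be sketched in Python.  This is an illustration under stated assumptions, not the program's implementation: window_sum and divide_truncating are hypothetical names, each counter is modeled as a list with one count per time unit, units outside the data are assumed to count as 0, and discarding the remainder is assumed to round toward zero.

```python
def window_sum(series_list, now, offset=0, depth=1):
    """Sum several per-unit count lists over `depth` units ending `offset` units before `now`."""
    if offset < 0 or depth < 1:
        raise ValueError("offset may not be negative and depth must be positive")
    newest = now - offset
    oldest = newest - depth + 1
    total = 0
    for series in series_list:
        for t in range(max(oldest, 0), newest + 1):
            if t < len(series):
                total += series[t]
    return total


def divide_truncating(value, divisor):
    """Divide and discard the remainder; a negative divisor flips the sign."""
    quotient = abs(value) // abs(divisor)
    return quotient if (value < 0) == (divisor < 0) else -quotient


pat_a = [1, 2, 3, 4, 5, 6]
pat_b = [0, 1, 0, 1, 0, 1]
now = 5  # index of the current time unit

# sum(patA,patB)[0:4]/4
total = window_sum([pat_a, pat_b], now, offset=0, depth=4)
print(total, divide_truncating(total, 4))  # 20 5
# Dividing by -2 yields a negative value suitable for subtracting.
print(divide_truncating(7, -2))            # -3
```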

4.6 Condition Function


Condition functions operate on condition expressions and produce conditions. The condition functions are all, any, and not, which compute the logical and, logical or, and negation of conditions.

all(patA>=300,patA<=600,any(patA>patB,patA<1000),not(patB=1))

As with any condition, where a condition function is used as a counter expression, it becomes 1 or 0. It is an error to try to make a condition function operate directly on a counter expression such as a pattern, but a pattern can be easily incorporated via a simple condition.

any(patA>0,patB>0)
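
The behavior of the condition functions can be sketched in Python.  The names cond_all, cond_any, cond_not, and as_counter are hypothetical, and rejecting anything that is not a true/false value is only a model of the error described above.

```python
def _check(conditions):
    """Reject counters: condition functions accept only true/false values."""
    for c in conditions:
        if not isinstance(c, bool):
            raise TypeError("condition functions need conditions, not counters")


def cond_all(*conditions):
    _check(conditions)
    return all(conditions)


def cond_any(*conditions):
    _check(conditions)
    return any(conditions)


def cond_not(condition):
    _check((condition,))
    return not condition


def as_counter(condition):
    """Convert a condition to the counter value 1 or 0."""
    return 1 if condition else 0


# all(patA>=300,patA<=600,any(patA>patB,patA<1000),not(patB=1))
pat_a, pat_b = 450, 2
result = cond_all(pat_a >= 300, pat_a <= 600,
                  cond_any(pat_a > pat_b, pat_a < 1000),
                  cond_not(pat_b == 1))
print(as_counter(result))  # 1
```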

4.7 Counter Function

Counter functions operate on counter expressions and produce counter values. The counter functions are sum, min, and max, which compute the sum, minimum, and maximum of a set of counters.

min(sum(patA)[0:4],max(patB)[0:4])

4.8 Conditional Counter

Where conditions are used as a counter expression, they are automatically converted to 1 or 0 if they are true or false. It is possible to cause true to be converted to other values by following the condition with ? and then a number or other counter expression.

patA>3?patA

Other values for false are also possible by following the true value by : which is followed by the false value.

patA>patB?9:patC
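
In Python terms, the two forms above behave roughly like the following sketch, where conditional_counter is a hypothetical name and the defaults of 1 and 0 model the automatic conversion.

```python
def conditional_counter(condition, if_true=1, if_false=0):
    """Return if_true when the condition holds, otherwise if_false."""
    return if_true if condition else if_false


pat_a, pat_b, pat_c = 5, 7, 2
# patA>3?patA
print(conditional_counter(pat_a > 3, pat_a))         # 5
# patA>patB?9:patC
print(conditional_counter(pat_a > pat_b, 9, pat_c))  # 2
```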

4.9 More Expression Examples


sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[5]

The sum of material and verbal cooperations from A to B five units of time before the current time.

sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[0:10]

The sum of material and verbal cooperations from A to B during the ten units of time ending at the current time.

sum(A-B-MaterialCooperation,A-B-VerbalCooperation)[5:10]/3

The sum of material and verbal cooperations from A to B during the ten units of time ending five units of time before the current time, divided by 3.

min(A-B-MaterialCooperation,A-B-MaterialConflict)

The lesser of the material cooperation or the material conflict from A to B at the current time.

min(A-B-MaterialCooperation,A-B-MaterialConflict)[0:50]

The least count of material cooperation or material conflict from A to B in the 50 units of time ending at the current time.

min(sum(A-B-MaterialCooperation,A-B-MaterialConflict))[0:50]

The least of the sums of material cooperation and material conflict in each unit of time from A to B in the 50 units of time ending at the current time.

min(sum(A-B-MaterialCooperation,A-B-MaterialConflict)[3:5])[0:50]

The least of the 3-unit-old 5-unit sums of material cooperation and material conflict from A to B in the 50 units of time ending at the current time.

min(sum(A-B-MaterialCooperation,A-B-MaterialConflict)[0:5])[3:50]

The same interpretation as the previous example.
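
The equivalence of the two previous examples can be checked with a small Python sketch.  It assumes that the offset of an outer function simply shifts the time at which the inner function is evaluated and that units outside the data count as 0; the names window, count_at, inner_sum, and outer_min are hypothetical.

```python
def window(now, offset, depth):
    """Time units covered by [offset:depth] when evaluated at `now`."""
    newest = now - offset
    return range(newest - depth + 1, newest + 1)


def count_at(series, t):
    """Count in unit t, treating units outside the data as 0 (an assumption)."""
    return series[t] if 0 <= t < len(series) else 0


def inner_sum(series, now, offset, depth):
    return sum(count_at(series, t) for t in window(now, offset, depth))


def outer_min(series, now, inner, outer):
    """min(sum(...)[inner])[outer]: the least inner sum over the outer window."""
    return min(inner_sum(series, t, *inner) for t in window(now, *outer))


# Hypothetical combined cooperation-plus-conflict counts per unit.
series = [(i * 7) % 5 for i in range(100)]
now = 99

first = outer_min(series, now, inner=(3, 5), outer=(0, 50))
second = outer_min(series, now, inner=(0, 5), outer=(3, 50))
print(first == second)  # True
```

In both forms every 5-unit sum ends between 52 and 3 units before the current time, so the same set of sums is minimized.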

min(max(sum(A-B-MaterialCooperation)[0:1])[0:1])[0:1]

Since minimums, maximums, and sums of a single value are that value and an age of 0 and duration of 1 is just default interpretation in the current time, this expression should be simplified to the unadorned event type with no additional expression, i.e. A-B-MaterialCooperation.

min(A-B-MaterialCooperations)[5]/2

Which operator (min, max, or sum) was specified is irrelevant since it applies to a single event type and time unit, but some operator was required so that dividing and aging could be specified.

Note that the division makes little sense in this sort of case because the counter is at most 1.

min(A-B-MaterialCooperations)[0:5]/2

The operator is now relevant, because minimization is applied over a period of 5 units of time, the result of which is divided by 2.

min(1,sum(A-B-MaterialCooperations,B-A-MaterialCooperations)[0:5]/10)

Count the number of material cooperations between A and B in either direction over the latest 5 units of time, scaled by a tenth so that a symbol is displayed for 10 or more, but never display more than 1 symbol.

all(sum(B-A-MaterialConflict)[5:5]>4,sum(A-B-MaterialConflict)[0:5]>10)

Return 1 symbol whenever more than 4 material conflicts from B to A spread over at most 5 units of time are followed by more than 10 material conflicts from A to B spread over at most 5 units of time.