Furthermore, since nothing is more perishable than an unused test set, there will be future UNIPEN calls for data.
The UNIPEN data format is free, and public samples are available at:
ftp://ftp.nici.kun.nl/pub/UNIPEN/forum
1 (Belgium)
1 (Korea)
1 (Spain)
2 (Canada)
2 (France)
2 (Italy)
2 (Japan)
2 (Netherlands)
3 (UK)
4 (Israel)
6 (Germany)
14 (USA)
The most important rule is: good annotation. Describe the writer, the writing conditions, the devices used and their settings, the software, the content of the handwritten material, etc. The UNIPEN format offers many keyworded fields to include such information.
There is a PostScript version of this paper.
"This is to announce:
open-unipen@unipen.nici.kun.nl
a mailing list for researchers who are interested in the UNIPEN
data format but who are not a member of the UNIPEN consortium.
Awaiting the release of the UNIPEN/NIST database for the public, the UNIPEN data format can already be used and processed by public domain software tools (http://unipen.nici.kun.nl/uptools3/). Some public data already exists: (ftp://ftp.nici.kun.nl/pub/UNIPEN/forum/).
The main entrance for UNIPEN in general is http://unipen.nici.kun.nl/
The open-unipen list allows for the exchange of ideas, data, and software. The directory ftp://ftp.nici.kun.nl/pub/UNIPEN/forum/ can be extended by uploads of data and software.
The unmoderated list is organized by Lambert Schomaker (schomaker@computer.org) at the Nijmegen Institute for Cognition and Information (NICI), The Netherlands.
Subscription to the open-unipen mailing list:
Send an E-mail message to:
Majordomo@unipen.nici.kun.NL
Containing the line:
subscribe open-unipen [your.email@address]
We hope that this service will meet the interests of on-line handwriting recognition researchers who cannot get to the official UNIPEN data."
Archives of the open-unipen list are kept here.
Benchmarks: 1a, 1b, 1c, 1d, 2, 3, 4, 5, 6, 7, 8
Note that only Benchmark #8 is a realistic, application-oriented test, because the word segmentation problem must also have been solved by the recognizer. No manual word segmentation is allowed in test Benchmark #8.
The benchmarks will be arbitrated by the National Institute of Standards and Technology (NIST) in the USA.
Read the UNIPEN Version 1.0 Definition
< PEN_STREAM > ::= <.PEN_DOWN> | <.PEN_UP>
Numbering starts at zero.
Example
.SEGMENT WORD 0-4 ? "fives"
.PEN_DOWN
123 567
123 567
123 567
123 567
.PEN_UP
123 890
123 890
123 890
.PEN_DOWN
123 567
123 567
123 567
123 567
.PEN_UP
123 890
123 890
123 890
.PEN_DOWN
123 567
123 567
123 567
123 567
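
To illustrate how the zero-based numbering works in the example above, the following is a minimal Python sketch, not part of the UNIPEN tools. It splits the text into pen streams and resolves the `0-4` range of the `.SEGMENT` line. The function name `parse_streams` and the regular expression are assumptions made for illustration. The sketch handles only the simple `first-last` delineation shown in the example and does not attempt to cover other forms the full definition may allow.

```python
import re

# The example from the text, rebuilt line by line.
SAMPLE = "\n".join(
    ['.SEGMENT WORD 0-4 ? "fives"']
    + [".PEN_DOWN"] + ["123 567"] * 4
    + [".PEN_UP"] + ["123 890"] * 3
    + [".PEN_DOWN"] + ["123 567"] * 4
    + [".PEN_UP"] + ["123 890"] * 3
    + [".PEN_DOWN"] + ["123 567"] * 4
)

# Hypothetical pattern for the simple form: .SEGMENT <level> <first>-<last> <quality> "<label>"
SEGMENT_RE = re.compile(r'\.SEGMENT\s+(\S+)\s+(\d+)-(\d+)\s+\S+\s+"([^"]*)"')


def parse_streams(text):
    """Return (streams, segments) for a small UNIPEN-style fragment.

    streams  : list of (kind, points); the list index is the stream number,
               so numbering starts at zero.
    segments : list of (level, first, last, label) with inclusive stream numbers.
    """
    streams = []
    segments = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line in (".PEN_DOWN", ".PEN_UP"):
            # Each keyword opens a new pen stream.
            streams.append((line[1:], []))
        elif line.startswith(".SEGMENT"):
            match = SEGMENT_RE.match(line)
            if match:
                level, first, last, label = match.groups()
                segments.append((level, int(first), int(last), label))
        elif not line.startswith(".") and streams:
            # A coordinate line belongs to the most recently opened stream.
            x, y = line.split()[:2]
            streams[-1][1].append((int(x), int(y)))
    return streams, segments


streams, segments = parse_streams(SAMPLE)
level, first, last, label = segments[0]
# The range is inclusive, so 0-4 selects five pen streams.
selected = streams[first:last + 1]
print(level, label, [kind for kind, _ in selected])
```

In this sketch the segment `0-4` covers three pen-down streams and the two pen-up streams between them, which is why the word "fives" spans five components in total.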