Newsgroups: comp.text.sgml
Date: 01 Oct 1994 00:12:28 UT
From: Tim Bray
Organization: MIND LINK! Communications Corp., Langley, BC, Canada
Message-ID: <36i9hc$6tg@deep.rsoft.bc.ca>
References: <36bvg6$n9e@kaa.heidelbg.ibm.com> <1994Sep29.232903.10546@falch.no> <36hfm9INNp3b@afshub.boulder.ibm.com>
Subject: Re: SGML editors and accented Latin characters

I've said this here before at length, so I'll say it briefly. For *database* applications, I've had excellent success just storing the characters in their native encoding; I find that the ISO 8859 and JIS sets (and soon 10646) should cover the large and increasing majority of commercially important environments. Seems easier to convert ISO->some-stupid-code-page or whatever via a simple linear filter at delivery time, than dealing with the vagaries of character entities, which there are a lot of ways to screw up. And you still get to use all the nice SGML structuring machinery orthogonal to the issue of what's between the tags, so to speak.

Some of the discussions here have suggested that where the application is heavily editing-oriented, especially in a heterogeneous environment, there are really good reasons for aggressively trying to use entities wherever possible. Would it be reasonable to speculate that entities are optimal for interchange, native encodings for repository & delivery?

Cheers, Tim Bray, Open Text Corporation

Newsgroups: comp.text.sgml
Date: 01 Oct 1994 03:04:34 UT
From: "Robert S. Kissel"
Message-ID: <9410011204.AB09429@tubesteak.cis.yale.edu>
Subject: HTML and SGML token replacement

(Apologies in advance for what will, no doubt, seem a naive question to the veterans of this newsgroup, but I didn't know whom to ask.)

As far as I can tell, there is no provision, in the DTD for HTML or HTML+, for user-defined tokens (and of course, no implementation in browsers to perform the substitution).
Here is the goal I am aiming for: Suppose an HTML anchor is desired which points to one of several different sources, DEPENDING ON WHICH OF SEVERAL LOCATIONS THE BASE DOCUMENT IS STORED -- is there a solution, within HTML?

Example: an HTML document is stored at two alternate sites; the HREF attributes of anchors vary, depending on which:

[site a:] ... select \here\ ...
[site b:] ... select \here\ ...

One would like the source HTML to define all such URL constants at the top, somehow, to reduce the number of source copies, as one would in other programming languages. Of course, it is simple enough to use various filters to pre-process the HTML file; however, I was wondering if there is some construct in HTML which is intended to handle such situations, and which, due to naivete, I am overlooking. I believe (currently) that it is NOT possible (without altering the DTD of HTML), but if it is, I'd like to know.

SGML, on the other hand, does afford more flexibility, and I'd like to know how such things are handled in DTD's (recommended style, etc.) from people more conversant with the syntax than I.

Thank you for suffering my ingenuousness, and for your kind attention to this query.

Newsgroups: comp.text.sgml
Date: 01 Oct 1994 18:23:07 UT
From: Charles Hurley
Organization: PANIX Public Access Internet and Unix, NYC
Message-ID: <36k9eb$1e3@panix.com>
Subject: Can someone please send me a fully Compliant SGML file?

To all concerned:

I am in the process of learning SGML. Can a kind soul please send me a fully compliant SGML document. One that includes the three major parts of a SGML document: first the SGML Declaration, second the Document Type Declaration (set), and finally the Document Instance. I prefer one that does not call the various parts externally, but includes all subparts (i.e., the Reference Concrete Syntax, the markup declaration, etc.). Of all the books I read on SGML, I have never seen an example of a full SGML document.
I've seen many examples of DTD's, document instances, and of course the Reference Concrete Syntax, but I've never seen them put together into one complete SGML electronic document (file). This is what I seek. If you know of an FTP site where such files exist please tell me. However, please make sure that they do exist. All help in my education will be heartily appreciated. My e-address is below (and above I assume).

Thank you all, C. Hurley

PS: The document can be on any subject, from beer nuts to specifications on the Space Shuttle.

--
Charles R. Hurley                                    Phone: +1 212 633 3759
Publishing Technologies Coordinator (TeX Projects)   Fax: +1 212 633 3685
Elsevier Science Inc.                                E-mail: churley@panix.com
655 Avenue of the Americas, New York, N.Y. 10010-5107

Newsgroups: comp.text.sgml
Date: 01 Oct 1994 23:13:24 UT
From: "Hugh J. Findlay"
Organization: Vnet Internet Access, Inc. - Charlotte, NC. (704) 374-0779
Message-ID: <36kqek$8lj@ralph.vnet.net>
References: <36k9eb$1e3@panix.com>
Subject: Re: Help on SGML.

[Charles Hurley]
| I am in the process of learning SGML. Can a kind soul please send me a
| fully compliant SGML document. One that includes the three major parts
| of a SGML document: first the SGML Declaration, second the Document
| Type Declaration (set), and finally the Document Instance. I prefer
| one that does not call the various parts externally, but includes all
| subparts (i.e., the Reference Concrete Syntax, the markup declaration,
| etc.). Of all the books I read on SGML, I have never seen an example
| of a full SGML document. I've seen many examples of DTD's, document
| instances, and of course the Reference Concrete Syntax, but I've never
| seen them put together into one complete SGML electronic document
| (file). This is what I seek. If you know of an FTP site where such
| files exist please tell me. However, please make sure that they do
| exist. All help in my education will be heartily appreciated. My
| e-address is below (and above I assume).

I'd be interested in seeing the same. Mind if I piggyback on this one? Like C. Hurley, I've seen pieces of today's SGML packages, but not all together (I have extensive BookMaster and tagged ASCII experience, but not Author-Editor/Avalanche/FrameBuilder...).

Thanx.

--
Hugh J. Findlay, Client Coordinator
The Kelton Group Inc.
5540 Centerview Drive, Ste. 412
Raleigh, NC 27606 / Tel: 919.851.4064 / Fax: 919.859.3561

Newsgroups: comp.text.sgml
Date: 02 Oct 1994 05:12:22 UT
From: "W. Eliot Kimber"
Organization: Passage Systems, Inc.
References: <19940927T183458Z.erik@naggum.no> <36hekvINNp3b@afshub.boulder.ibm.com>
Subject: Re: EMPTY proposal

[Wayne L. Wohler]
| Whichever solution is adopted, it must also address what may appear
| between the start and end tags of an empty element. It isn't as
| obvious (to me) as you might think (with my "shoot-from-the-hip"
| suggestion:
|
| * inclusions No element or data allowed, so no
| * pi's Yes, PI's are not data
| * re, rs, space and other space function chars Yes

But they would not be taken as data, obviously.

| * comments Yes
| * ignored marked sections Yes
| * "empty" entity references not sure, need to refresh my memory on
| where entity refs are valid

General text entities yes, data or subdoc entities, no. Data and subdoc entities are only valid where character data is valid.

--
W. Eliot Kimber (kimber@passage.com)
Systems Analyst and HyTime Consultant
Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400
465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911

Newsgroups: comp.text.sgml
Date: 02 Oct 1994 06:19:37 UT
From: Dan Connolly
Organization: HaL Computer Systems, Inc.
References: <9410011204.AB09429@tubesteak.cis.yale.edu>
Subject: Effectivity techniques [Was: HTML and SGML token replacement]

[Robert S. Kissel]
| (Apologies in advance for what will, no doubt, seem a naive question to
| the veterans of this newsgroup, but I didn't know whom to ask.)

[There are no stupid questions... It's a shame that this group seems so intimidating for so many people. I'm as guilty as anybody of biting folks in the head for posting nonsense, I guess. But for the record: questions are welcome. Correct answers, or honest attempts at them, are nice too. Statements of apparent fact without evidence, citations, or supporting arguments are not.]

| As far as I can tell, there is no provision, in the DTD for HTML or
| HTML+, for user-defined tokens (and of course, no implementation in
| browsers to perform the substitution). Here is the goal I am aiming
| for: Suppose an HTML anchor is desired which points to one of several
| different sources, DEPENDING ON WHICH OF SEVERAL LOCATIONS THE BASE
| DOCUMENT IS STORED--is there a solution, within HTML?

Short answer: nope.

| Of course, it is simple enough to use various filters to pre-process
| the HTML file;

If you're after a practical solution to a real-life problem, "pre-processing" is it, for now. Of course, this doesn't preclude the possibility of using some of SGML's more expressive features such as marked sections in the "source" version of your documents, and then down-translating to "practical HTML" before shipping the data over the wire. For example:

\ ]> \News on Quantum Cryptography\ \

Quantum Cryptography\

\
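
The pre-processing route described in this post (keep a richer "source" version with marked sections, then down-translate to plain HTML per site) can be sketched roughly as follows. This is only an illustration in Python, not the example from the original post: the regular expression handles just the simple, non-nested form `<![ %name; [ ... ]]>`, and the switch names `site.a` and `site.b` and the `example` URLs are invented for the sketch.

```python
import re

# Matches a simple, non-nested marked section: <![ %name; [ content ]]>
# (a real SGML parser handles far more than this pattern does).
MARKED_SECTION = re.compile(
    r"<!\[\s*%([A-Za-z][\w.-]*);\s*\[(.*?)\]\]>", re.DOTALL
)


def down_translate(source: str, switches: dict[str, str]) -> str:
    """Keep marked sections whose switch is INCLUDE and drop all others."""

    def resolve(match: re.Match) -> str:
        name, content = match.group(1), match.group(2)
        # Assumption of this sketch: an undeclared switch means IGNORE.
        return content if switches.get(name, "IGNORE") == "INCLUDE" else ""

    return MARKED_SECTION.sub(resolve, source)


# Hypothetical source fragment: one anchor, two site-specific start tags.
SOURCE = (
    "select "
    '<![ %site.a; [<a href="http://a.example/qc.html">]]>'
    '<![ %site.b; [<a href="http://b.example/qc.html">]]>'
    "here</a>"
)

if __name__ == "__main__":
    # Build the copy that will be shipped from site a.
    print(down_translate(SOURCE, {"site.a": "INCLUDE", "site.b": "IGNORE"}))
```

Running the filter once per site with a different switch table yields one "practical HTML" file per location from a single source copy, which is the reduction in source copies the original question asked about.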
Newsgroups: comp.text.sgml
Date: 04 Oct 1994 16:19:55 UT
From: "Wayne L. Wohler"
Organization: IBM Information Development Strategy and Tools
Message-ID: <36rvbbINNomn@afshub.boulder.ibm.com>
References: <1994Sep27.154444.16794@ast.saic.com> <36bvg6$n9e@kaa.heidelbg.ibm.com>
Subject: Re: Multilingual HTML, SGML documents?

[Peter Kerr]
| What about the reader of the document?
| He/she needs the same character map.

That is a virtue of using character entity sets to interchange these "maps": they are interchanged as part of the document since they are referenced from the doctype subset or DTD. The "reader" (the program NOT the user) can then resolve these entity references as best it can, which will differ depending on the hardware/software's capability for rendering the glyphs in question.

| ISO 10646 compliance could be a systematic way to ensure that
| authoring/browsing systems work properly.

Yup.

--
Wayne L. Wohler                              Internet: wohler@vnet.ibm.com
Dept G82/025Z                                IBMMAIL: USIB29WX@IBMMAIL
Information Development Strategy and Tools   Phone: +1 303 924 5943
IBM Corporation
PO Box 1900
Boulder, Colorado 80301-9191

Newsgroups: comp.text.sgml
Date: 04 Oct 1994 16:33:13 UT
From: "Wayne L. Wohler"
Organization: IBM Information Development Strategy and Tools
Message-ID: <36s049INNomn@afshub.boulder.ibm.com>
References: <36bvg6$n9e@kaa.heidelbg.ibm.com> <1994Sep29.232903.10546@falch.no> <36hfm9INNp3b@afshub.boulder.ibm.com> <36i9hc$6tg@deep.rsoft.bc.ca>
Subject: Re: SGML editors and accented Latin characters

[Tim Bray]
| I've said this here before at length, so I'll say it briefly. For
| *database* applications, I've had excellent success just storing the
| characters in their native encoding; I find that the ISO 8859 and JIS
| sets (and soon 10646) should cover the large and increasing majority of
| commercially important environments.

I'll bet this does work "most" of the time. For many people and organizations all of the time.
Does this not depend on a) what kind of information is being stored (what languages, etc.) and b) what the database is used for? I'm not arguing that it is a "wrong" approach but that, as always, you had better understand what you are doing to make the correct implementation tradeoffs.

In a way, this is a variation of what I mentioned about editing. You can store it in the database using 8859, etc., and when needed, pull it out and convert to character entities for transmission or processing by programs where the character entities are needed.

| Would it be reasonable to speculate that entities are optimal for
| interchange, native encodings for repository & delivery?

It is certainly necessary for interchange across 7 bit channels. It could also be useful for systems that don't use 8859 for accented chars but, say, use char, deadkey, accent or some other encoding for presentation. Admittedly, if you can handle the character entity mapping, you can probably map the 8859 codepoint too but the entity mechanism is there already and doesn't need to be invented and implemented.

--
Wayne L. Wohler                              Internet: wohler@vnet.ibm.com
Dept G82/025Z                                IBMMAIL: USIB29WX@IBMMAIL
Information Development Strategy and Tools   Phone: +1 303 924 5943
IBM Corporation
PO Box 1900
Boulder, Colorado 80301-9191

Newsgroups: comp.text.sgml
Date: 04 Oct 1994 17:24:17 UT
From: Rick Jelliffe
Subject: response to comments on miniSGML

I've had generally favorable email about the miniSGML grammar.

[Bob Agnew]
| I suggest you rename it to "microSGML" or perhaps "nanoSGML".
| In your language the Tags have no meaning or allowable content.

I would happily name it picoSGML or even subquarkal-pseudoSGML if that would make its purpose and scope understandable. The small size is the whole point: there are many uses for generic markup for which SGML is too big and too obscure.
But if, instead of inventing your own syntax for these uses, you use the same tagging system as SGML (e.g., a subset scanner grammar like miniSGML), then your data magically also is SGML and can be manipulated by real SGML systems as well.

In the SGML handbook, Goldfarb's comments about SGML tagging design say (from memory) that long names are better than fixed, and that explicit ending delimiters are good too: miniSGML keeps both.

And in SGML, tag names have no meaning either. They are not given meaning by the DTD, merely structural interpretation. And my tags do have allowable content: anything that the equivalent SGML document's DTD would allow. It's just that I don't do any checking, and make some conventions explicit; but every real SGML system has extra-standard conventions too. That REs in data content will be replaced by spaces in the application (if it performs its own line-breaking) is one, for example, but one outside the ambit of the standard (as I mentioned, Interleaf doesn't follow this one). Another convention widely required is for maximum line lengths: some processing systems have restrictions in this. Since these types of conventions are commonplace and needful, my little restrictions on when newlines should not be generated, even if allowed by the grammar, are not unusual in kind, just explicit.

[Bob Agnew]
| to format attributes why not put an empty element called
| \ in the DTD and tell the FOSI what to do.

This seems a rather grand approach, so I think what we have here is a failure to communicate. Are you saying that SGML parsers are easy to implement if you also implement a FOSI processor? My friends often tell me that less complicated things are generally more simple, but perhaps that is not always the case. The intended users of miniSGML are not able to call down a real SGML parser and FOSI system deus ex machina like that.
If I need to quickly make a version of SED that can accept some kind of SGML, I don't want to have a version of SED that requires its SGML data marked-up in SGML and with a FOSI processor and instance required too, do I? The SGML handling code would be larger than the SED handling, and the FOSI handling would be too. All that work just to, say, substitute one word for another is hardly warranted.

Warren Baird wondered what "single page grammar" meant. I meant to say "single-page grammar". I hope that makes it clear. Small is beautiful.

Warren also likes Empty elements. But an empty element is the equivalent of a normal element which doesn't happen to have any content in the instance. Since I make the brave assumption that data is valid (I suppose this means not generated by a human, in practise) there is no problem. (In fact, I think the Standard says that Empty elements pass both a start and end tag to the application in the ESIS (in the annex or appendix on ESIS?)). Perhaps miniSGML is an ESIS markup language, then?

-ricko

--
Rick Jelliffe
Electrical Engineering, University of Technology, Sydney
Allette Systems
Sydney, Australia

Newsgroups: comp.text.sgml
Date: 04 Oct 1994 17:53:36 UT
From: Ralph Ferris
Organization: Fujitsu Open Systems Solutions, Inc.
Message-ID: <36s4r0$p0c@yucca.ossi.com>
Subject: Re: What DTD to use for Man-Pages?

[Chris Hector]
| I want to put some man pages into SGML and I am looking for a DTD. I
| would like something tailored toward traditional UNIX man pages rather
| than an all-encompassing DTD.
|
| Does anyone have a favorite - or a place that I can look for DTD's?

Manual page markup is thoroughly covered in the DocBook DTD, which has become the de facto standard for software technical documentation. DocBook was originally developed by HaL Computer Systems and O'Reilly & Associates (who hold the copyright); anyone can use it freely provided the copyright notice is included.
DocBook's development is under the control of the Davenport group. I suggest getting on their alias to keep up with developments. You can get information on how to do so by logging into ftp.ora.com via anonymous ftp and following the instructions in ./pub/davenport/toget.docbook

--
Ralph E. Ferris
Project Manager, Electronic Publications
Fujitsu Open Systems Solutions, Inc.
Engineering Services
Phone: +1 408 456 7806
E-mail: ralph@ossi.com

Newsgroups: comp.text.sgml
Date: 04 Oct 1994 18:10:16 UT
From: Rick Jelliffe
Subject: "was Re: multilingual HTML, SGML documents"

Don Connley has queried if shift-JIS is valid SGML, as such. (Shift-JIS is a Japanese encoding widely used on desktop computers.)

I think no. The SGML character sets must be a fixed number of bytes, but allow multicode character sets through code extension (escape characters). (In Ken Lunde's terms, fixed and modal.) This takes a bit of effort to take the Standard in any other way, I think.

But shift-JIS is really a 1 bit tag paired with 7 bits of data. The tag bit tells how to interpret the current 7 bits and the next 7 bits (going between two good states + possibly 1 error state). (In Lunde's terms, non-modal.)

I don't think ISO 2022 really deals with longer character sizes: I haven't seen it, but the impression I get (e.g., from its title) is that it is concerned with 7 and 8 bit character codes.

So shift-JIS is not really an SGML-friendly character coding: the SGML declaration and some other things (character entity references), for example, are a little bit hard to use in that regard. If you convert to JIS or fixed-EUC first, however, all the conceptual problems go away.

I am looking at this area at the moment. Is there any Japanese who can shed light on this?

-ricko

--
Allette Systems
Sydney, Australia

Newsgroups: comp.text.sgml
Date: 04 Oct 1994 20:33:55 UT
From: Casey Scalzi
Organization: Hitachi Computer Products (America), Inc.
Message-ID: <36se7j$d7a@sunfish.hi.com>
Subject: SGML sources

Are there any books available that might get me started in SGML? I have a copy of the docbook DTD but would like something a little more textbookish.

thanks
casey (cscalzi@hi.com)

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 03:54:50 UT
From: Ed Aborn <22acacia@cftnet.com>
Organization: A3 Grafx
Subject: Shareware SGML viewers?

Please excuse this question if the answer is obvious, but I am new here. Are there any Shareware or Public Domain SGML viewers available for either the UNIX and/or Windows platforms? Thanks in advance for any information you might have.

Ed Aborn - 22acacia@cftnet.com

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 11:40:44 UT
From: John Turner
Organization: Hughes Aircraft of Canada Limited, Calgary AB
Subject: Microstar

Has anyone heard of a company called Microstar? Apparently they have a graphical interface to SGML. I would appreciate any information about their product.

Jt

--
_ _ /: Thbbft. \\'o.O; _/ =(___)= |U|

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 14:30:00 UT
From: "G. Ken Holman"
References: <19940909.4927@naggum.no>
Subject: Re: 5870 - Canadian participation in ISO/IEC JTC1/SC18/WG8

[Erik Naggum]
| get involved in WG8 before it is too late! it is not only your data,
| it is the whole framework within which you design your applications
| that is at stake

[Dr. James D. Mason Convenor, ISO/IEC JTC1/SC18/WG8]
| I heartily second Erik's plea for participation. We need more than
| just the few hardcore faithful who have stuck with the project for
| years on end. It would be good to have more participants from both the
| computer-science community and the user community working on the
| project.
| ...
| Active participation in WG8 comes through participation in national
| standards bodies (e.g., ANSI, BSI, DIN, AFNOR, TSCJ, NNI) that certify
| participants to the working group.

To interested Canadian parties, our national standard body to JTC1 is the Standards Council of Canada (SCC).

I am currently the chair of the Canadian committee CAC/ISO/IEC JTC1/SC18/WG8 Document Processing and Related Communication - Document Description and Processing Languages.

There are 11 people from 9 organizations participating in one way or another right now and we can always use more input.

Please contact me if you are interested in joining in.

.............. Ken

--
G. Ken Holman                 | Phone: +1 613 727-5696 x317
Vice President, R&D           | Fax: +1 613 727-9491
Microstar Software Ltd.       | WATS: 1 800 267-9975
34 Colonnade Road North       | E-mail: gkholman@microstar.com
Nepean Ontario CANADA K2E-7J6 |

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 14:37:31 UT
From: Bob Agnew
Organization: Science Applications International Corp.
Message-ID: <1994Oct5.143731.20589@ast.saic.com>
References: <36p76b$c1@erinews.ericsson.se>
Subject: Re: SGML to Postscript formatter

[Peter Witt]
| I am looking for a high speed formatter that format SGML documents
| instances with graphics and tables to Postscript. There are some
| products around on the market that I have seen, but I am not sure if I
| have seen all of them. One of the problems when it comes to formatting
| in a "production formatting/printing machine" is the lack of standards
| for formatting data since the "production machine" may be a different
| tool than the one in which the document was originally created. Does
| anyone have any information concerning this?

There are two prevalent standards for the formatting of SGML tagged data, FOSIs and DSSSL. FOSIs are governed by the outspec DTD in Mil-Std-28001B Appendix B. FOSIs are supported by Arbortext and DataLogic. DSSSL?

--
"If it moves salute it; if it doesn't move \ it."

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 15:53:00 UT
From: Cheryl Conway
Message-ID: <9409067814.AA781434242@njcorp.akbs.com>
Subject: Anyone have any info about WORD's recent SGML release?

I've been passed a news release regarding Microsoft WORD (Redmond, WA, Sept. 13) that says:

"Microsoft Corporation (Nasdaq: MSFT) today announced Microsoft(R) SGML Author for Word, an easy-to-use add-on tool for Microsoft Word 6.0 for the Windows(TM) operating system for creating Standard Generalized Markup Language (SGML). SGML Author hides the complexities of SGML, allowing users to create SGML quickly without extensive training or knowledge."

=> Does anyone know anything about this product?

The release goes on to say:

"Users no longer have to spend hours manually tagging documents -- SGML Author does it for them automatically."

=> Particularly, I am curious about how this automatic tagging is
=> done.

Any input would be greatly appreciated!

Cordially,

--
Cheryl (Ventura) Conway
Advantage KBS
1 Ethel Road, Suite 106B
Edison, NJ 08817
phone -- +1 908 287 6494 x 253
FAX -- +1 908 287 3193
email -- cconway@akbs.com

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 19:18:00 UT
From: Graham Tritt
Organization: Swiss Federal Government
Message-ID: <1994Oct5.201800.119@gw2.admin.ch>
References: <36bvg6$n9e@kaa.heidelbg.ibm.com> <36i9hc$6tg@deep.rsoft.bc.ca>
Subject: Re: SGML editors and accented Latin characters

[Tim Bray]
| Would it be reasonable to speculate that entities are optimal for
| interchange, native encodings for repository & delivery?

Sounds reasonable, but there are more processing activities than interchange, storage and delivery. A major problem is sorting - where ä is placed congruent with a or with ae, depending on whether you use the Swiss or the German telephone book. And not, PLEASE, after z!

Such problems have been solved by database manufacturers such as Oracle and by companies concerned with natural language internationalization, such as in HP Unix.

It seems also that we are again trying to solve the problem of square wheels, as printcaps, termcaps, terminal emulators, and many word-processors have tried for a long time.
Remember also, that language/character set must be a local switch, for documents which contain quotes or titles in different languages. Even for text flowing temporarily in reverse direction.

My solution is to put all the language pieces in database records (after converting the entities to the database character representation) and have an attribute of language for sorting. We have to manage German, French and Italian versions.

Any suggestions for alternative methods are welcome

--
Graham Tritt
Swiss Federal Office of Information Technology and Systems
Information and Documentation Center
graham.tritt@ste3.bfi.admin.ch

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 19:53:46 UT
From: Peter Flynn
Organization: CERN European Lab for Particle Physics
References: <36p76b$c1@erinews.ericsson.se>
Subject: Re: SGML to Postscript formatter

[Peter Witt]
| I am looking for a high speed formatter that format SGML documents
| instances with graphics and tables to Postscript. There are some
| products around on the market that I have seen, but I am not sure if I
| have seen all of them. One of the problems when it comes to formatting
| in a "production formatting/printing machine" is the lack of standards
| for formatting data since the "production machine" may be a different
| tool than the one in which the document was originally created. Does
| anyone have any information concerning this?

What about SGML-->TeX-->PostScript? I wrote a pilot SGML2TeX which you can pick up by ftp in pub/sgml on curia.ucc.ie. At least using TeX overcomes any portability problems.

///Peter
pflynn@curia.ucc.ie

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 20:05:46 UT
From: Peter Flynn
Organization: CERN European Lab for Particle Physics
References: <1994Oct3.173542.1@east.pima.edu>
Subject: Re: SGML readers for Oxford classics (TEI)?

[Gloria McMillan]
| 2. I'm at the point where I have downloading and learning the TEI
| encodings section by section. In the TEI GUIDELINES menu choices
| some readers are mentioned. I don't know where one can get them.

I'm not sure what you mean by "reader", but if you want to see TEI texts you can use any normal text editor but you'd have to sort your way thru the tags by eye. If you want a pseudo-WYSIWYG kind of interface, you'll have to buy one, I'm afraid.

| 3. I am on a VAX VMS system at my school. Someone suggests that I
| might be able to do some types of document searches just by getting
| the TPU EVE add-on (the sysadmin of my node could get that.) Does
| that sound reasonable?

I believe TPU using the SGML LSE does work, but I've never tried it. A VAX running VMS (in line mode?) is probably not a good system for this: a PC or Mac might be better, but you're still going to have to buy the software, unless you join the club and pursue the "convert-to-HTML" line so you can use any WWW browser for display. But you usually forgo the richness of markup which is what the TEI is all about.

| Any ideas for me on SGML readers? I welcome any comments. You can
| write my e-mail address if you need more info from me.

I wish I had better news. I use TEI a lot for a project, but we had the money to buy software.

///Peter

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 20:21:08 UT
From: Peter Flynn
Organization: CERN European Lab for Particle Physics
References: <19940909.4927@naggum.no>
Subject: Re: Canadian participation in ISO/IEC JTC1/SC18/WG8

[Dr. James D. Mason Convenor, ISO/IEC JTC1/SC18/WG8]
| I heartily second Erik's plea for participation. We need more than
| just the few hardcore faithful who have stuck with the project for
| years on end. It would be good to have more participants from both the
| computer-science community and the user community working on the
| project.
| ...
| Active participation in WG8 comes through participation in national
| standards bodies (e.g., ANSI, BSI, DIN, AFNOR, TSCJ, NNI) that certify
| participants to the working group.

I tried this. My Nat Stds Bod only gets funding to send its own staff to meetings, so volunteers cannot go unless their own orgs support them (mine won't). They also will not supply copies of docs for volunteers, as they are not allowed to make copies, don't have the time and money, and we're supposed to buy them individually.

Now, none of this is ISO's "fault" (well, maybe it is a bit); rather, it's my govmt's baby, and they're the usual crowd of no-hopers and misfits.

Question: how do willing volunteers participate in these circumstances?

///Peter
Ignore the posting address, I'm at pflynn@curia.ucc.ie

Newsgroups: comp.text.sgml
Date: 05 Oct 1994 22:47:32 UT
From: Erik Naggum
Organization: Naggum Software; +47 2295 0313
Message-ID: <19941005T224732Z.erik@naggum.no>
Subject: Re: Shift-JIS (was Re: multilingual HTML, SGML documents)

[Rick Jelliffe]
| Don Connley has queried if shift-JIS is valid SGML, as such.
|
| I think no. The SGML character sets must be a fixed number of bytes,
| but allow multicode character sets through code extension (escape
| characters). This takes a bit of effort to take the Standard in any
| other way, I think.

the character set scheme in SGML is inherently flawed. the techniques explicitly covered in the standard are insufficient outside of very simple character codings, and they effectively forbid any extensions. ISO 2022 is not really supported, and the subset of code extensions that is supported does not cover any substantial practice in the field. those who wanted support for ISO 2022 and code extension at the time didn't really know what they wanted, and accepted something that didn't solve their problem. as it stands, it would have been better had SGML not had its character set declarations at all.

the connection between the SGML declaration and the characters actually found in the entities that are read is already significantly weakened by the fact that the entity manager must convert line endings into record boundaries. all computer languages that wish to be used outside of a single (type of) system must cater to the diversity of line endings, and this SGML does well in principle, but we still have a vestige of "codes" for the RS and RE "characters" when they really are events like the Entity end "signal". this is very unfortunate, and leads to a number of other minor problems.

the main problem is this: if the entity manager is free to translate a small set of codes to other codes, and the SGML parser is only concerned with what it receives from the entity manager, then the entity manager might as well translate everything into a uniform code that the SGML parser would be happy with. the SGML parser is only concerned with the codes that mean something for the parsing process, anyway, and the application gets whatever the SGML parser considers to be "not markup".

it should be obvious that this scheme leaves a lot to be desired in terms of identifying the codings involved. we note that ESIS does not contain any provisions for identification of character sets. again, ESIS is _not_ a good idea outside of conformance testing, for which it was defined.

to solve this problem, it is obvious that we have to sever the link between the SGML declaration and the actual characters in the files (or whatever). that is, each file (etc) read must identify its character sets somehow, and the entity manager must then be responsible for translation of this source character set into whatever the SGML parser will be happy with. at this stage, it is inconsequential how the coding at the file level looks, as long as the SGML parser is happy. the result is, however, that the SGML declaration's character set stuff is inconsequential, as well.
| But shift-JIS is really a 1 bit tag paired with 7 bits of data. The | tag bit tells how to interpret the current 7 bits and the next 7 bits | (going between two good states + possibly 1 error state). shift-JIS is relatively easy to decode. it provides a 14-bit encoding to the application, and it would take only minimal effort to teach an SGML declaration to accept the codes in which the (ASCII) characters required by SGML are coded. the translation work must then be performed by the entity manager, and the SGML parser must be able to handle more than 8-bit characters, and pass them through to a cooperating application. (not all SGML parsers are able to do this without serious hacking, and they aren't written in programming languages that would enable users to customize them, either.) | I don't think ISO 2022 really deals with longer character sizes: I | haven't seen it, but the impression I get (e.g., from its title) is | that it is concerned with 7 and 8 bit character codes. I was unimpressed when Ken Lunde told me that he had never seen ISO 2022, much less the ISO 2375 character set registry. (I attended an event at the Computer Literacy Bookstore in San Jose, CA, where Ken Lunde gave a talk when his book Understanding Japanese Information Processing came out.) I remain unimpressed when people who talk about character encodings have not read the basic documents. they aren't that expensive, although they are very complicated, so nobody who tries to summarize them gets them right enough that readers of the summaries don't get seriously confused. ISO 2022 allows everything from one- to four-byte encoding schemes, but unfortunately only multiple 7-bits-per-byte encodings. there are actually provisions for even longer sequences. the encoding schemes are especially well suited to 7- and 8-bit encodings in a single byte, but more is available. #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer.
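To make the decoding step discussed in the post above concrete, here is a small sketch, under stated assumptions, of how an entity manager might walk a Shift-JIS byte stream. The helper names `is_lead_byte` and `split_sjis` are hypothetical, the lead-byte ranges are the usual JIS X 0208 ones (0x81-0x9F and 0xE0-0xEF), and well-formed input is assumed. The actual conversion uses Python's standard `shift_jis` codec rather than anything described in the thread.

```python
def is_lead_byte(b):
    """True if b opens a two-byte Shift-JIS character (JIS X 0208 ranges)."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF


def split_sjis(data):
    """Split Shift-JIS bytes into one- and two-byte character units.

    Assumes well-formed input: a lead byte is always followed by a trail byte.
    """
    units = []
    i = 0
    while i < len(data):
        step = 2 if is_lead_byte(data[i]) else 1
        units.append(data[i:i + step])
        i += step
    return units


# ASCII markup mixed with two-byte data characters
sample = "<p>日本語</p>".encode("shift_jis")
units = split_sjis(sample)
decoded = [u.decode("shift_jis") for u in units]

for unit, char in zip(units, decoded):
    print(unit.hex(), char, hex(ord(char)))
```

The markup characters stay single ASCII bytes, while the data characters come out as code points wider than 8 bits, which is why the parser and the application both need to cope with more than 8-bit characters.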
Newsgroups: comp.text.sgml Date: 06 Oct 1994 09:24:53 UT From: Christoph Altenhofen \ Organization: IBM Germany, European Networking Center, Heidelberg Message-ID: \ References: \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Apparently they have a | graphical interface to SGML. I would appreciate any information about | their product. I suppose you are thinking of Near&Far, a tool to handle DTDs in a graphical manner. I saw a demo at CeBIT in Hannover this spring, but didn't take a closer look at it. In my opinion it's a good tool to visualize the dependencies and structure of a DTD, but I don't think that it makes creating a DTD much easier than the traditional way, coding DTDs by hand. Their address: Microstar Software Ltd. 34 Colonnade Road North Nepean, Ontario Canada K2E 7J6 Phone: (613) 727-5696 or (800) 267-9975 Email: cade@msl.isis.org Hope this helps, Christoph -- Christoph Altenhofen, IBM European Networking Center, Vangerowstr. 18, D-69115 Heidelberg, Germany * Tel.: +6221 / 59 - 4503 * e-mail: CALTENHOFEN at VNET.IBM.COM, christo@heidelbg.ibm.com * X-400: C=DE;A=IBMX400;P=IBMMAIL;S=ALTENHOFEN;G=ALTENHC Newsgroups: comp.text.sgml Date: 06 Oct 1994 09:37:25 UT From: Joachim Schrod \ Organization: TH Darmstadt, FG Systemprogrammierung Message-ID: <370ggl$mqp@rs18.hrz.th-darmstadt.de> References: <36bvg6$n9e@kaa.heidelbg.ibm.com> <36i9hc$6tg@deep.rsoft.bc.ca> <1994Oct5.201800.119@gw2.admin.ch> Subject: Re: SGML editors and accented Latin characters [Tim Bray] | Would it be reasonable to speculate that entities are optimal for | interchange, native encodings for repository & delivery? [Graham Tritt] | Sounds reasonable, but there are more processing activities than | interchange, storage and delivery. A major problem is sorting - where | ä is placed congruent with a or with ae, depending on whether you | use the Swiss or the German telephone book.
And not, PLEASE, after z! The umlauts are really a nice example: Not only are they sorted differently in Swiss and German phone books -- they are even sorted differently from one German book to another. I.e., the sort order depends on the circumstances and the expected audience. So please don't attach one sort order to `German'. It's reasonable to attach one order per processing of one (sub-)document, though. Joachim -- Joachim Schrod Email: schrod@iti.informatik.th-darmstadt.de Computer Science Department Technical University of Darmstadt, Germany Newsgroups: comp.text.sgml Date: 06 Oct 1994 11:38:45 UT From: Gary Houston \ Organization: Actrix Information Exchange Message-ID: \ References: \ <1994Sep26.233841.22334@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Dan Connolly] | But there is no specified way to write a document in Russian, let alone | Japanese. This issue must be addressed _very_ soon. [Richard L. Goerwitz] | There are Mosaic clients specially made (read: hacked) to autodetect | which of the ISO 8859 charsets a doc is using (one of which includes | Cyrillic, another Greek, another Hebrew, another Arabic, etc.). This | is not a good solution, but the very fact that they exist testifies to | some demand. So why does HTML need to specify a character set, really? You could just as well use the transport layer, i.e., the MIME header in HTTP: Content-type: text/html; charset=iso-8859-1 This could be set to the character set of the local computer system, when reduction to us-ascii is not practical. And don't worry about the SGML parser. Try something like this in your SGML declaration:

    CHARSET
    BASESET "ISO 646:1983//CHARSET International Reference Version (IRV)//ESC 2/5 4/0"
    DESCSET   0   9 UNUSED
              9   2 9
             11   2 UNUSED
             13   1 13
             14  18 UNUSED
             32  95 32
            127   1 UNUSED
    BASESET "ANY//CHARSET Extended Character Sets Permitted//ESC 2/5 4/0"
    DESCSET 128 128 "CHAR"

I have never seen a programming language which required a character set declaration.
The same technique can be applied to SGML, as Tim Bray has already pointed out. Newsgroups: comp.text.sgml Date: 06 Oct 1994 12:42:43 UT From: Werner de Bruijn \ Organization: TNO Building and Construction Research Message-ID: <1994Oct6.124243.4657@hermes.bouw.tno.nl> Subject: Printing HTML documents with SGML sw?? Does it require much effort to print HTML documents with SGML viewers. Mainly we want page numbering and maybe some additional company specific page headers. Thanks, Werner. PS: Any recommendation for public domain or low cost SGML viewers? -- Werner de Bruijn TEL : (31) 15-842040 TNO-Building and Construction Research FAX : (31) 15-122182 P.O. Box 49 INTERNET : W.deBruijn@bouw.tno.nl 2600 AA Delft, the Netherlands Newsgroups: comp.text.sgml Date: 06 Oct 1994 14:23:31 UT From: "Richard L. Goerwitz" \ Organization: University of Chicago Message-ID: <1994Oct6.142331.11581@midway.uchicago.edu> References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | There are Mosaic clients specially made (read: hacked) to autodetect | which of the ISO 8859 charsets a doc is using (one of which includes | Cyrillic, another Greek, another Hebrew, another Arabic, etc.). This | is not a good solution, but the very fact that they exist testifies to | some demand. [Gary Houston] | So why does HTML need to specify a character set, really? You could | just as well use the transport layer, i.e., the MIME header in HTTP: | | Content-type: text/html; charset=iso-8859-1 You're thinking about this like an American. I can sympathize, being one too. The fact is that huge portions of the world's population are multilingual, and it's not untypical for them to mix two languages in a single document (sometimes more). Many servers based in countries where English is not the national language, for example, are bilingual (take a look, e.g., at some of the Israeli Web sites). 
Historians, philologists, literary scholars, etc. also often quote their sources in the original languages. Biblical scholars customarily quote Greek, Hebrew, and sometimes several other languages, in their papers and commentaries. Some countries even require that everything be bilingual. Others often change back and forth between English and a local language (e.g. India). Every student of a foreign language also has to use bilingual dictionaries, as you probably found in college. Such tools will find increasing use on a truly worldwide Web. What we need is a way to jump arbitrarily from one language to another and from one encoding scheme to another. This would seem to imply moving language/encoding tags out of the transport layer and into the actual HTML definition. Historically US programmers have failed to see very far down the road to multilingual environments. Most folks I talk to don't really even understand the difference between internationalization/localization and multilinguality. Others are so ignorant of "foreign" writing that they do not even know that languages and fonts are different beasts (e.g., they fail to recognize that Arabic runs right-left and that special conventions apply to mixing it with English text). It's *not* because we in the US are pathologically stupid. We just live in a large and basically monolingual culture. There's really no need to learn anything but English in the US. I would guess, in fact, that while European engineers snipe at their US counterparts for failing to take their languages into account, they (as I noted in another posting) are making the same mistake themselves for languages *they* don't know. -- -Richard L. Goerwitz goer%midway@uchicago.bitnet goer@midway.uchicago.edu rutgers!oddjob!ellis!goer Newsgroups: comp.text.sgml Date: 06 Oct 1994 15:28:19 UT From: Bob Agnew \ Organization: Science Applications International Corp.
Message-ID: <1994Oct6.152819.24336@ast.saic.com> References: \ Subject: Re: response to comments on miniSGML [Bob Agnew] | I suggest you rename it to "microSGML" or perhaps "nanoSGML". | In your language the Tags have no meaning or allowable content. [Bob Agnew] | to format attributes why not put an empty element called | \ in the DTD and tell the FOSI what to do. [Rick Jelliffe] | This seems a rather grand approach, so I think what we have here is a | failure to communicate. Are you saying that SGML parsers are easy to | implement if you also implement a FOSI processor too? My friends often | tell me that less complicated things are generally more simple, but | perhaps that is not always the case. The intended users of miniSGML | are not able to call down a real SGML parser and FOSI system deus ex | machina like that. Anything but grand. You will wind up doing the same thing, only in a framework which will subvert the whole purpose of SGML. I typeset my newlines in my FOSI as a non-printing character on a line. It's real easy for me to do that. Have you ever considered the requirements for a FOSI engine or its equivalent, a style sheet processor? -- "If it moves salute it; if it doesn't move \ it." Newsgroups: comp.text.sgml Date: 06 Oct 1994 15:35:34 UT From: Norbert Winklareth \ Organization: Exoterica Corporation Message-ID: <1994Oct6.153534.22105@exoterica.com> References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ Subject: Re: Multilingual HTML, SGML documents? [Gary Houston] | I have never seen a programming language which required a character set | declaration. The same technique can be applied to SGML, as Tim Bray | has already pointed out. SGML is not a programming language; it has no semantics. This is its major strength and its major weakness. Unfortunately, most people either over- or under-estimate what SGML is capable of doing. SGML is about input representations; that is why character sets are an issue. How about a comp.text.sgml in-joke?
(hi #\ I am waving my red flag again /:-) ) regards -- Norbert Winklareth email: me -- nw@exoterica.com Exoterica Corporation Product Info -- info@exoterica.com Tech Support -- support@exoterica.com Newsgroups: comp.text.sgml Date: 06 Oct 1994 18:08:24 UT From: Warren Baird \ Organization: University of Waterloo Message-ID: \ References: <36p76b$c1@erinews.ericsson.se> Subject: Re: SGML to Postscript formatter [Peter Witt] | I am looking for a high speed formatter that format SGML documents | instances with graphics and tables to Postscript. There are some | products around on the market that I have seen, but I am not sure if I | have seen all of them. One of the problems when it comes to formatting | in a "production formatting/printing machine" is the lack of standards | for formatting data since the "production machine" may be a different | tool than the one in which the document was originally created. Does | anyone have any information concerning this? The system we're using here consists of a shell script that calls sgmls and sgmlsasp, and optionally calls a perl script with the output of sgmlsasp to polish the output. We either go directly from SGML to HTML for online viewing, or go from SGML to LaTeX, and then from LaTeX to PostScript for printing. Going directly from SGML to Postscript is feasible, but probably not practical. With our system, going from SGML to HTML or LaTeX is quite fast, but going from LaTeX to PS is as slow (or as fast) as running LaTeX normally is. Warren Newsgroups: comp.text.sgml Date: 06 Oct 1994 19:38:20 UT From: Simon North \ Message-ID: <31@direct.iaf.nl> References: <19940909.4927@naggum.no> \ \ Subject: Re: Canadian participation in ISO/IEC JTC1/SC18/WG8 I'm afraid I have to echo Peter's sentiments and his request. In The Netherlands the national standards organisation (the NNI) has had almost all of its government subsidies scrapped and is forced to try and support itself. 
The company I work for has had the foresight, wisdom and understanding to support my participation in a standards commission for the past two years while we try to define some basic acceptance criteria for the manuals supplied with consumer products. This means that my time and all my travelling expenses have been paid for. Now the NNI is threatening to cancel the whole activity if the commission cannot provide enough funds to pay their costs (about $2000 per year)... Since our commitment is predominantly personal (my employer is a defense manufacturer and is frankly almost completely uninterested in the standard itself) it looks as though a wonderful effort is going to go down the plughole. As for ISO participation ... forget it. While I can swing the odd day off (plus several days from my holiday entitlement and a LOT of work at home) and the occasional train ticket, I hardly see them swallowing the costs of trips to the States when I can't even get there for international conferences... So, while I (for one) would sacrifice a lot to be able to help, I too seem condemned to observe from the sidelines, unless, that is, someone has a suggestion? -- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: comp.text.sgml Date: 06 Oct 1994 19:52:36 UT From: Simon North \ Message-ID: <32@direct.iaf.nl> References: \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Microstar (Microstar Software Ltd., 34 Colonnade Road North, Nepean, Ontario, Canada K2E-7J6) are the producers of Near&Far, a package that runs under MS-Windows and Unix for the visualisation (graphical modelling) of DTDs. So far I have found it to be an excellent tool, especially for involving non-SGML people in the DTD design process.
Let's face it, the moment that you try to involve someone in document analysis you almost immediately reach for a pen and paper and start drawing block diagrams. With this tool you can convert those drawings into a _documented_ DTD directly. It is also wonderful for finding your way round an existing DTD, letting you quickly SEE how it all hangs together. Put this tool together with the DTD2HTML package (and the EDC2HTML package), set Mosaic loose on it and then tack on EASYDTD (a fantastic "quick and dirty" DTD editor) and you have, IMNSHO, a pretty powerful DTD workbench. All you need then to complete the suite is a good parser (the PSGML suite is excellent under EMACS, even if it won't work with the Windows EMACSes - or, if you've got a few days to spare, the SGML wizard for MS-Word for Windows 6.0). Microstar are regular posters to this newsgroup; if you check the SGML archive via WAIS you can track down a few, or someone else will probably post all the details. Otherwise mail me and I'll send them on. Whatever, the package gets my full endorsement. Just my opinion, worth what you paid for it ... -- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: comp.text.sgml Date: 06 Oct 1994 19:53:56 UT From: Peter Flynn \ Message-ID: \ References: \ \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Apparently they have a | graphical interface to SGML. I would appreciate any information about | their product. [Christoph Altenhofen] | I suppose you think of Near&Far, a tool to handle DTDs in a graphical | manner. I saw a demo on the CeBit in Hannover this spring, but didn't | take a closer look at it. In my opinion it's a good tool to visualize | the dependencies and structure of a DTD, but I don't think that | creating a DTD is made so much easier as the traditional way, coding | DTDs by hand.
I have their demo disk. It's pretty good, but it _is_ only a demo, you can't actually _do_ much with it. But the idea is nice and I think it might make a good teaching aid. Me, I'll stick to Emacs and psgml-mode. The disk and doc does say that they encourage copying of the demo, so I'll zip the contents when I get back to the office and put it on my ftp server: look next week on www.ucc.ie in pub/sgml for nearfar.zip or something like that. Mail me to hassle me if it doesn't appear :-) ///Peter Newsgroups: comp.text.sgml Date: 06 Oct 1994 21:33:40 UT From: Bernie J Scholz \ Organization: GE Corporate Research and Development Message-ID: \ Keywords: DTD, SGML Subject: Public Domain DTDs To whom it may concern: I am interested in finding out more information on public domain document type definitions (DTDs). I have the names of some: ISO 12803, AAP, DocBook, ATA/AIA, and J20008, and would like to know about them and if there are any others available. For each of the ones mentioned above, (and any others that you can think of) I would like to know how/where to obtain copies of the DTD and who or what is the governing body responsible for its maintenance. I am especially interested in a DTD which can be used for representing technical operator/maintenance manuals. I am familiar with the CALS and ATOS DTDs and am wondering if there are any other public domain DTDs out there which may meet my needs. Responses can be sent to scholz@crd.ge.com Please accept my sincerest apologies if my above request is already covered in a FAQ list. Just tell me where to find it and I'll be on my way...... Thanks in advance, Bernie Scholz -- Bernhard J. 
(Bernie) Scholz Computer Scientist Technical Systems Program Information Technology Lab GE Corporate Research and Development Schenectady, NY 12301 scholz@crd.ge.com Newsgroups: comp.text.sgml Date: 06 Oct 1994 23:53:47 UT From: Rick Jelliffe \ Message-ID: <9410060853.AA06092@ee.uts.edu.au> Subject: more on Japanese SGML, Ken Lunde & ISO 2022 Syun Tatiya, the Japanese TEI representative, tells me that the Japanese translation of the SGML Standard (ISO 8879) is JIS X 4151. It apparently has some additional annexes or notes on handling JIS X 0208. He doesn't know if these are available in English. Does anyone have the gist of these? (Uniscope?, Fujitsu?, Nihon Unitec?, EBT?) [Erik Naggum] | I was unimpressed when Ken Lunde told me that he had never seen ISO | 2022, much less the ISO 2375 character set registry. To defend Ken Lunde a little, his book is a survey of what is actually found in Japan (+ a little on Korea, Taiwan + China). This is an enormous effort: his book is the only thing available, and it is much better and broader than anyone could realistically hope for. If those standards are extraneous to his already complicated task, i.e., if in fact they are not in use, then I'm not surprised if he has shied away from reading them! He does mention that they exist though. BTW, what is ISO 2375? -ricko Newsgroups: comp.text.sgml Date: 07 Oct 1994 00:45:38 UT From: Rick Jelliffe \ Message-ID: <9410060945.AA06102@ee.uts.edu.au> References: \ <1994Sep27.154444.16794@ast.saic.com> \ <36bvg6$n9e@kaa.heidelbg.ibm.com> \ <36rvbbINNomn@afshub.boulder.ibm.com> Subject: Re: Multilingual HTML, SGML documents? [Peter Kerr] | ISO 10646 compliance could be a systematic way to ensure that | authoring/browsing systems work properly. [Wayne Wohler] | Yup. There are too many character sets, so let's make everyone use yet another one?
(-;ricko Newsgroups: comp.text.sgml Date: 07 Oct 1994 05:17:10 UT From: Peter Bomberg \ Organization: Achilles Networking Inc, Ottawa, ON Message-ID: <372lkm$37q@hermes.achilles.net> References: <36p76b$c1@erinews.ericsson.se> Subject: Re: SGML to Postscript formatter [Peter Witt] | I am looking for a high speed formatter that format SGML documents | instances with graphics and tables to Postscript. There are some | products around on the market that I have seen, but I am not sure if I | have seen all of them. One of the problems when it comes to formatting | in a "production formatting/printing machine" is the lack of standards | for formatting data since the "production machine" may be a different | tool than the one in which the document was originally created. Does | anyone have any information concerning this? Peter, since you did not mention the products that you have looked at I am not sure if I'm just giving you old information. I have looked closely at using one of four products: CAPS (Xerox), Pager (Datalogics), I6sgml (Interleaf) or Arbortext Publisher (Arbortext). They all have pros and cons, but if you want any specific info then just ask :} Peter ps. What products have you looked at? (and what did I miss in my research) ;-) -- Peter Bomberg pbx@achilles.net Sound Foundation tel. +1 613 241 2352 Newsgroups: comp.text.sgml Date: 07 Oct 1994 05:25:56 UT From: Peter Bomberg \ Organization: Achilles Networking Inc, Ottawa, ON Message-ID: <372m54$37q@hermes.achilles.net> Subject: word processor to SGML conversion I would like to ask if there is a good listing of products in this area? If so, where? And if not, could anyone who has any info please email me. I am specifically interested in WP 5.2 and WP 6.0, Frame and Word to SGML converters/translators. Peter -- Peter Bomberg pbx@achilles.net Sound Foundation tel.
+1 613 241 2352 Newsgroups: comp.text.sgml Date: 07 Oct 1994 08:03:59 UT From: Gary Houston \ Organization: Actrix Information Exchange Message-ID: \ References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Gary Houston] | So why does HTML need to specify a character set, really? You | could just as well use the transport layer, i.e., the MIME header | in HTTP: | | Content-type: text/html; charset=iso-8859-1 [Richard L. Goerwitz] | You're thinking about this like an American. I can sympathize, being | one too. The fact is that huge portions of the world's population are | multilingual, and it's not untypical for them to mix two languages in a | single document (sometimes more). No, since we are not specifying a character set it could just as easily be: Content-type: text/html; charset=unicode-1.1-utf-8 and if this isn't good enough there are still character entities. | do not even know that languages and fonts are different beasts (e.g., | they fail to recognize that Arabic runs right-left and that special | conventions apply to mixing it with English text). You would also want to add a "lang" attribute to each element, as is already done in some DTDs. I think this has already been proposed for HTML. Gary Newsgroups: comp.text.sgml Date: 07 Oct 1994 08:32:13 UT From: "Shawn T. Rutledge" \ Organization: Arizona State University Message-ID: \ References: <36q6sg$15b@hopper.acm.org> Subject: Re: CMC DTD (HPB/FDA) [hopp@acm.org] | Anyone here to discuss a DTD for CMC section of submission? Understand | this is being prepared by HPB and pretty much agreed to by FDA. As in | when and how to get ... No, but I'll gladly discuss with you the BGF paradigm for doing RTA, which was approved last summer by the YTS section of the DBW. The BS group didn't concur but got voted out. You know, the proposal from SGB looked a lot better IMHO. And BTW so did KJG's. 
Sorry, just couldn't resist... :-) -- _______ KB7PWD - now on packet! shawn.rutledge@asu.edu (_ | |_) html: http://enuxsa.eas.asu.edu/~rutledge/home.html __) | | \\__________________________________________________________________ * IEEE * ham radio * robotics * cyberspace * RISC * PIC * SunSparc * techno * *** Happy Birthday to the Internet! May it live long and prosper. *** Newsgroups: comp.text.sgml Date: 07 Oct 1994 13:15:00 UT From: "G. Ken Holman" \ Message-ID: \ References: <9410060853.AA06092@ee.uts.edu.au> Subject: Re: more on Japanese SGML, Ken Lunde & ISO 2022 [Rick Jelliffe] | BTW, what is ISO 2375? \ International Register of Coded Character Sets to be used with Escape Sequences 1. Introduction 1.1 General This document is the ISO International Register of Coded Character sets To Be Used with Escape Sequences for information interchange in data processing. It is compiled in accordance with the provisions of ISO 2022, "Code Extension Technique" and of ISO 2375 "Procedure for Registration of Escape Sequences". This International Register contains coded character sets which have been registered in accordance with procedures given in ISO 2375. Its purpose is to identify widely used coded character sets and associate with each a unique escape sequence by means of which it can be designated according to ISO 2022 and ISO 4873. The publication of this International Register should promote compatibility in international information interchange and avoid duplication of effort in developing application-oriented coded character sets. Registration provides an identification for a coded character set but implies nothing about its status; it may or may not be part of a standard of an international, national or a corporate body. However, if such a standard is published subsequently to the registration, it would be appropriate for the escape sequence identifying the character set to be specified in the standard. 
If it is desired to register a set, application should be made to the Registration Authority through an appropriate Sponsoring Authority as specified in ISO 2375. Any character set can be a candidate for registration if it meets the requirements of ISO 2375. The Registration Authority ascertains that the proposals received are formally in accordance with this International Standard, technically in accordance with ISO 2022, and, where applicable, with ISO 646 and ISO 4873, and meet the presentation practice of the Registration Authority. \ ISO/IEC JTC1/SC2 Character Sets and Information Coding is responsible for ISO 2022 and ISO 2375 and the registration procedures and criteria. The Registration Authority is the European Computer Manufacturers Association (ECMA). One of the standards used in many places (Canada, U.S. (Prodigy is an example), and Japan) is the North American Presentation Layer Protocol Syntax, NAPLPS, which is based on ISO 2022 and explicitly includes all the present and future character sets of ISO 2375 through the escape switching mechanisms. -- G. Ken Holman, Vice President, R&D, Microstar Software Ltd. (Computer Aided Document Engineering), 34 Colonnade Road North, Nepean, Ontario, CANADA K2E-7J6 * Phone: +1 613 727-5696 x317 * Fax: +1 613 727-9491 * WATS: 1 800 267-9975 * E-mail: gkholman@microstar.com (new email 94-06-22) * Info: CADE@microstar.com Newsgroups: comp.text,comp.text.desktop,comp.text.sgml,comp.lang.postscript Date: 07 Oct 1994 14:11:37 UT From: Lars Bruzelius \ Organization: Uppsala University Computer Center, SE Message-ID: <373lce$l85@caesar.udac.se> Subject: Announce: 15th Annual Xplor Electronic Document Conference & Exhibit The 15th Annual Xplor Electronic Document Conference and Exhibit will take place November 6-11, 1994, at the Phoenix Civic Plaza, Phoenix, AZ.
For more information take a look at: http://pc-78-120:8001/www/Xplor/15th_Annual_Conference/Conference.html Most of the 64-page conference announcement has been made available in hypertext format. For additional information please contact: Xplor International, 24238 Hawthorne Blvd., Torrance, CA 90505-650, USA. Telephone: 1 800 669 7567; +1 310 373 3633; +31 2979 41 136 (European Office) Telefax: +1 310 375 4340 Internet: info@xplor.com Please say that you found it on the Web! -- Lars Bruzelius Uppsala University Computer Center Box 174, S-751 04 Uppsala, Sweden. Telephone: +46 (0)18 187731 Bitnet: uddri@seudac21 Telefax: +46 (0)18 516600 Internet: Lars_Bruzelius@udac.uu.se Newsgroups: comp.text.sgml Date: 07 Oct 1994 14:18:37 UT From: Robert Ducharme \ Organization: Courant Institute of Mathematical Sciences Message-ID: <373lbt$qsh@doc.cs.nyu.edu> References: <9409067814.AA781434242@njcorp.akbs.com> Subject: Re: Anyone have any info about WORD's recent SGML release? "Automatic tagging" may mean little more than taking something tagged with the MS Word style "whatever" and surrounding it with \\ in some kind of exported ASCII file. Automatic validation, with arbitrary DTDs, is the real question, and I've been following every snippet of news about this add-on that I could and still don't know much. I have heard that Microsoft employs many OmniMark programmers and uses SGML extensively for their CD-ROM products. Given this and the existence of the Word add-on, their lack of presence on comp.text.sgml is a bit surprising -- or, maybe their silence should tell us something. - Bob DuCharme Newsgroups: comp.text.sgml Date: 07 Oct 1994 14:59:04 UT From: INStatus \ Message-ID: <373nno$r1r@newsbf01.news.aol.com> Subject: Need Info on BBML -- Black Bird Markup Language Hi, Does anyone know anything about BBML -- Black Bird Markup Language?
I've heard Microsoft uses this markup language for some of their publishing software and is thinking about using it for their online service. Will Microsoft succeed in trying to push this as a standard to co-opt HTML? How is it related to SGML? Any resources (Net-bound or otherwise) that can illuminate BBML? Any help appreciated. BTW, there was a message posted here in July about a FAQ for this newsgroup, but no response. I guess that means the answer is no? Thanks! INStatus@aol.com Newsgroups: comp.text.sgml Date: 07 Oct 1994 15:05:25 UT From: Bob Agnew \ Organization: Science Applications International Corp. Message-ID: <1994Oct7.150525.16523@ast.saic.com> References: \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Apparently they have a | graphical interface to SGML. I would appreciate any information about | their product. Sure, just email to cade@microstar.com but then the Microstar folks read this group regularly so I'm sure they'll be in touch with you. -- "If it moves salute it; if it doesn't move \ it." Newsgroups: comp.text.sgml Date: 07 Oct 1994 16:46:34 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ Subject: MS Word SGML Add on The following is uuencoded, zipped RTF of a fax I received. Could be errors in OCR or manual conversion, but it should do. section 1 of uuencode 5.21 of file x.zip by R.E.M. 
[uuencoded data for x.zip omitted] Newsgroups: comp.text.sgml Date: 07 Oct 1994 17:13:47 UT From: Raymond Leggiero x3412 \ Organization: Martin Marietta Message-ID: \ Subject: Active Systems Does anybody know if Active Systems is still around? If they are, what is their email address? -- Ray Leggiero Internet: leggiero@e7sa.epi.syr.ge.com Martin Marietta Corp Phone : +1 315 456 3412 Electronics Park - 7 MD24 Fax : +1 315 456 0180 Syracuse, NY 13221-4840 "All opinions are mine and not those of my employer" Newsgroups: comp.text.sgml Date: 07 Oct 1994 18:30:23 UT From: Mark Walter \ Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> Subject: Re: Anyone have any info about WORD's recent SGML release? [Cheryl Conway] | I've been passed a news release regarding MicroSoft WORD (Redmond, WA, | Sept.
13) that says: | | "Microsoft Corporation (Nasdaq: MSFT) today announced Microsoft(R) SGML | Author for Word, an easy-to-use add-on tool for Microsoft Word 6.0 for | the Windows(TM) operating system for creating Standard Generalized | Markup Language (SGML). SGML Author hides the complexities of SGML, | allowing users to create SGML quickly without extensive training or | knowledge." | | => Does anyone know anything about this product? | | The release goes on to say: | | "Users no longer have to spend hours manually tagging documents -- SGML | Author does it for them automatically." | | => Particularly, I am curious about how this automatic tagging is | => done. Any input would be greatly appreciated! I saw Microsoft's product at its introduction a few weeks ago in San Francisco. It's an exciting product, because it melds SGML into a popular tool that can also be used for general word processing. Here's my short take on what it does: Microsoft's product is an SGML import/export filter for WinWord 6 developed with the help of Avalanche Development Corp. It is a sophisticated filter, one that includes an SGML parser and can be configured for arbitrary DTDs. In simple terms, it maps Word document template styles to SGML elements and attributes, but, in fact, it handles quite a few of the messy instances where these two document representations do not map on a one-to-one basis. It covers everything you can do in Word, including math and tables. Setting up the filter requires some knowledge of SGML (as one would also need to develop a DTD), but a graphic DTD visualizer is included to make it easier to point and click to objects being associated. Once the filter is set up, you can make a Word template and writers can simply use the "Save As" command to generate a valid SGML document. SGML Author also takes any errors and annotates the original Word document so authors can see what needs fixing. For more details, John Vail at Microsoft is the product manager. 
This month's \ newsletter has a more detailed 3-page writeup that includes screen dumps. If you don't already get \, they can be reached at +1 303 680 0875. PS: not sure if this is actually in beta yet; anyone on the beta list care to comment? -- Mark Walter Seybold Publications PO Box 644, Media, PA 19063 phone: +1 610 565 2480 fax: +1 610 565 4659 email: mwalter@seyboldpub.ziff.com Newsgroups: comp.text.sgml Date: 07 Oct 1994 18:59:20 UT From: "Claude L. Bullard" \ Message-ID: <9410071859.AA49368@source.asset.com> References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | Biblical scholars customarily quote Greek, Hebrew, and sometimes | several other languages, in their papers and commentaries. Some | countries even require that everything be bilingual. That's better. Appearing to bash the American engineers for being practical isn't what you meant, I'm sure. You point out the difficulties of the WWW/HTML club by citing real applications. The scope of the application is always defined for a commercial application before one builds it. This prevents "gold-plating" as they say in military procurements. If every culture that wants a WebSite can shape it to their local requirements, then the plating is not that expensive, at least until another WebSite wants to read their document. MY point is this: the WWW is the Tower of Babel, in the metaphor of the Bible, AI applications, and in a very literal sense. To make it all things to all potential users will be an expensive and potentially fruitless endeavor. Its real utility and the major reason for its current wave of popularity is that it makes a portion of the content of the Internet *easy to navigate*. It is not an international standard nor can it be. It is not robust enough to be the basis of a "truly worldwide Web." 
Calling it that and calling its DTD THE Hypertext Markup Language are acts of hubris or simple naivete. It is the Internet WWW and the Internet HTML. That's the honest political base on which it depends and by which it is controlled. For those that are *willing* or that *agree* verbally or by contract to work within the boundaries defined for it by its *creators*, it can be very useful as with all artifacts that owe their existence to their political role. If it is to be required by those who control those politics to be "truly worldwide" then its creators must own up to the obligation and the responsibility to *serve* that large and ever changing user base. And no one can do that for free for long. Achieving the "holy" WWW may or may not be a task that is achievable in linear time and cost. But, by being championed as a *one size fits all solution*, it damages the market it seeks to create by creating a false perception that all one needs to play the Global Hypermedia Game is to get the freeware and a connection. While the goals are laudable and the existing accomplishments very useful, a more sober, and yes, more mature assessment of the requirements of those goals should be made. I would not be surprised if that was the essential reason for Mosaic Communications. Jim Clark knows a profitable enterprise when he sees one and if he can get a group of young zealous developers to dedicate their careers to building products on a diet of "liquid caffeine and the corporate account at Domino's" under the control of "old'uns" recruited from his last enterprise at Silicon Graphics, that's the goal of all good capitalists: a profitable product built by cheap labor under the aegis of an experienced management team. Venture capitalists always approve a business plan with those elements. But it isn't a Holy Crusade to "make all the information in the world available to the fingertips". One more "wet blanket" in this "long story": a Luddite perspective.
One hundred years ago, the curve for the rate of technical evolution on this planet went almost vertical. Some speculate that it was the result of the decreasing limits on the integration and long term storage of information made possible by incremental improvements in transportation and communication (e.g., the train, the telephone, the telegraph). While the quality of life in some cultures improved, the wealth of nations moved quickly, and the technology became cheap and ubiquitous, the twentieth century of endless progress based on cheap information was also the century of the greatest inhumanity to man through global war, tyranny and destruction of minority cultures. Our spiritual impoverishment was only matched by the ability of small groups to dominate those whose will and resources were not sufficient to resist them. Cultures that face each other across narrowing borders often resort to violent means to resolve the ambiguities. Will it be the policy of the I-WWW to ensure that all opinions, beliefs and product descriptions be available? If I want to post the instructions for creating a virus that only affects Intel-based processors, will that be permitted? Who will stop me? Net-iquette? On the CTS, Erik will. |-) A few miles from where I sit and type this, live the men who designed and built the engines that can take a man to the moon or deliver a holocaust of nuclear fire to anywhere on a globe that I can address with a single push on the ENTER key. It was their dream to take humanity to the stars. It was not their dream to enslave humanity by terror. They did both. Technology knows nothing of morals nor can it. This may be another treatise on "the meaning of life", to Mr. Goerwitz, Mr. Connolly and others. Selah. It is from one who shares your vision but who watched a vision unfold before and knows that some dreams become nightmares for the dreamer. It is not a reason to quit and return to the caves, but it is a reminder to pause to wonder. 
Len Bullard Newsgroups: comp.text.sgml Date: 07 Oct 1994 19:04:49 UT From: Gerry Grenier <71031.2122@CompuServe.COM> Organization: John Wiley & Sons Message-ID: <37464h$b62$1@mhadf.production.compuserve.com> Subject: DTD to DTD conversion I am trying to find an efficient way of mapping and then parsing one SGML-DTD file into a similar, but different SGML-DTD file. I don't know if I'm better off writing a script in say, Paradox or if I can make life easier with off-the-shelf software. I have looked at literature for Avalanche software's The Hammer and see that one can map an SGML DTD file to a word processing application. If anyone has experienced The Hammer first hand, and thinks that it can handle my task, please let me know. -- Gerry Grenier John Wiley & Sons ggrenier@wiley1.jwiley.com Newsgroups: comp.text.sgml Date: 07 Oct 1994 20:36:41 UT From: "Richard L. Goerwitz" \ Organization: University of Chicago Message-ID: <1994Oct7.203641.24614@midway.uchicago.edu> References: \ <1994Oct6.142331.11581@midway.uchicago.edu> \ Subject: Re: Multilingual HTML, SGML documents? [Gary Houston] | and if this isn't good enough there are still character entities. Character entities are okay, but very cumbersome for docs containing substantial sections of text that use them. Also, they are not backwards compatible with the many de facto standards used in many countries. As you suggest, you could use Unicode coded via UTF-8, but Unicode has its own problems, and I don't think it's wise to hem ourselves into this or that scheme. There should, in my estimation, be a way to label sections of text as using X encoding for Y language, as well as a way to label the default encoding and language for the entire document. It may sound cumbersome (a criticism I leveled at character entities), but I think it's needed in order to get the required flexibility.
Unicode is never going to satisfy everyone, and it's always going to be a good idea to allow people to switch, say, between variant 8-bit schemes. | You would also want to add a "lang" attribute to each element, as is | already done in some DTDs. I think this has already been proposed for | HTML. There was quite a discussion of this recently on the WWW mailing list. -- -Richard L. Goerwitz goer%midway@uchicago.bitnet goer@midway.uchicago.edu rutgers!oddjob!ellis!goer Newsgroups: comp.text.sgml Date: 07 Oct 1994 20:59:03 UT From: Kevin Brennan \ Organization: Honeywell ATS Message-ID: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> Subject: SGML "Viewers" We've been hearing about several SGML "viewers". I'd be interested in hearing pros/cons on a few. The most recent one I've seen is from Jouve. We're particularly interested in Air Transport Association (ATA) SGML compliant viewers with Temporary Revision capability. Thanks in advance, -- Kevin Brennan Publishing System Lead Honeywell Air Transport Systems Phoenix, Az brennan@saifr00.ateng.az.honeywell.com Newsgroups: comp.text.sgml Date: 07 Oct 1994 21:58:04 UT From: Charles Hurley \ Organization: PANIX Public Access Internet and Unix, NYC Message-ID: <374g9c$ff3@panix.com> Keywords: YASP, MS-DOS, and copy Summary: Does anyone have a MS-DOS copy of YASP? Subject: YASP, does anyone have a MS-DOS copy? Hello to all, Does anyone have a MS-DOS copy of YASP? Or if not, can someone please tell me how to compile YASP with Borland C++ version 4.01. Please be as detailed as possible, as I am only an average C/C++ user. You may post or send your responses to the below e-address. Thank you, C. Hurley -- % Charles R. Hurley Phone: +1 212 633 3759 % Publishing Technologies Coordinator (TeX Projects) Fax: +1 212 633 3685 % Elsevier Science Inc. E-mail: churley@panix.com % 655 Avenue of the Americas, New York, N.Y.
10010-5107 Newsgroups: comp.text.sgml Date: 07 Oct 1994 22:07:23 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: \ <1994Oct7.150525.16523@ast.saic.com> Subject: Re: Microstar [Bob Agnew | [Microstar] | Sure, just email to cade@microstar.com but then the Microstar folks | read this group regularly so I'm sure they'll be in touch with you. O do they? I mailed the address given on the demo disk I mentioned yesterday but haven't had a reply (this was over a week ago). If there's anyone from Microstar reading this, could they email me please: I want _prices_. :-) ///Peter Newsgroups: comp.text.sgml Date: 07 Oct 1994 22:10:58 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | Microsoft's product is an SGML import/export filter for WinWord 6 | developed with the help of Avalanche Development Corp. It is a | sophisticated filter, one that includes an SGML parser and can be | configured for arbitrary DTDs. In simple terms, it maps Word document | template styles to SGML elements and attributes, but, in fact, it | handles quite a few of the messy instances where these two document | representations do not map on a one-to-one basis. It covers everything | you can do in Word, including math and tables. | | Setting up the filter requires some knowledge of SGML (as one would | also need to develop a DTD), but a graphic DTD visualizer is included | to make it easier to point and click to objects being associated. | | Once the filter is set up, you can make a Word template and writers can | simply use the "Save As" command to generate a valid SGML document. | SGML Author also takes any errors and annotates the original Word | document so authors can see what needs fixing. This is excellent news. 
So long as they don't cripple it like WP did with Intellitag, this should be very useful. Do you know if the parser uses only the Reference Concrete Syntax, or can it accept arbitrary SGML Declarations? I'll be interested to see if it works with the TEI DTD :-) ///Peter Newsgroups: comp.text.sgml Date: 07 Oct 1994 23:19:55 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> Subject: Re: Anyone have any info about WORD's recent SGML release? I made a posting to a separate topic earlier today on this. However, I do have the following concern that I sent to somebody who is an early user, I guess a Beta tester. ------------------------------------------------------------------------ Having now reviewed the Fact Sheet that describes the MS plans for SGML Author for Word, it would appear that Word would be able to create/input an SGML file. However, the formatting will still be restricted by the oversights in Word, e.g., to use my favorite example, you still will not be able to use dictionary style definitions and do automagic numbering of clauses and the table of contents. Hard coding is just unacceptable. It is unclear as to what restrictions will be placed on the specification of a DTD. Brings back memories of my past experience with mark up languages. Newsgroups: comp.text.sgml Date: 07 Oct 1994 23:22:02 UT From: "Todd G. Hivnor" \ Organization: Rensselaer Polytechnic Institute, Troy NY Message-ID: <374l6q$fuv@usenet.rpi.edu> Subject: Storage Models for SGML data I'm wondering what's going on with storage models for hypertext SGML data. I'm looking for a way to store large quantities of slowly evolving SGML files. Specifically, the data in question is government regulations, but the problem is probably more general. Things I'd like to see in the storage system: 1) Revision control (when, how, and by whom was the data changed?
And, put it back the way it was) 2) Support for hierarchical documents (paragraph D of subchapter 3 of title blah blah blah) 3) An ability to maintain bi-directional links. (Suppose the legislature inserts a paragraph, shifting all of the remaining paragraphs down. How do I keep the pointers into the relocated paragraphs valid?) I could use a relational database, but I'm wondering if there isn't something out there which was designed with these problems in mind. Clues? Todd Hivnor hivnot@rpi.edu Newsgroups: comp.text.sgml Date: 07 Oct 1994 23:27:36 UT From: Erik Naggum \ Organization: Naggum Software; +47 2295 0313 Message-ID: <19941007T232736Z.erik@naggum.no> References: \ Subject: Re: MS Word SGML Add on (text version) [Howard Kaikow] | The following is uuencoded, zipped RTF of a fax I received. Could be | errors in OCR or manual conversion, but it should do. thanks for the contents. I have taken it out of its two proprietary worlds (PKZIP and RTF) and put it into the universal form of the free world: text. #\ --------------------------------------------------------------------------- Microsoft Introduces First SGML Authoring Tool for Microsoft Word New Add-On Tool Makes Creating SGML Easy REDMOND, Wash. - Sept. 13, 1994 - Microsoft Corporation today announced Microsoft* SGML Author for Word, an easy-to-use add-on tool for Microsoft Word 6.0 for the Windows* operating system for creating Standard Generalized Markup Language (SGML). SGML Author hides the complexities of SGML, allowing users to create SGML quickly without extensive training or knowledge. Users no longer have to spend hours manually tagging documents -- SGML Author does it for them automatically. SGML is an international standard that describes the relationship between a document's content and its structure. SGML allows document-based information to be shared and reused across applications and hardware platforms in an open, vendor-neutral format. 
It defines a methodology for "marking" the contents of a document such as the title, the author, a graphic, the date, or even specific categories of data, essentially turning the document into a database. SGML is used by organizations such as governments, airlines and pharmaceutical companies, among others, that are looking for ways to reuse and publish document-based information electronically. Because it is an open data specification, SGML enables companies to publish information through media such as CD-ROM, electronic viewers and paper. In the airline industry, manufacturers are investigating distributing their aircraft service bulletins electronically to be read with intelligent document viewing systems. For example, if a maintenance person needs a guide for adjusting a plane's landing gear, a viewing tool could assemble all the necessary information automatically. "Many companies are looking for easy-to-use, cost-effective applications like SGML Author," said Peter Pathe, general manager of the Word business unit at Microsoft. "SGML Author extends the power and ease of use of Word to the emerging demand for online documents and electronic publishing." SGML Author Makes It Easier to Create SGML According to InterConsult, Inc., a Massachusetts-based research firm, the most common request among companies is to make SGML easier to use. SGML Author simplifies the SGML creation process by allowing users to work entirely within Word's graphical word-processing environment. Users of SGML Author never see SGML "tags" in Word and are therefore shielded from the complexities of SGML so they can focus entirely on creating documents. Because users of SGML Author for Word are already working within a familiar word-processing environment, creating SGML can be quicker, easier and more cost-effective. Because SGML Author requires no programming, it also facilitates SGML creation by administrators.
Using intuitive point-and-click management tools, administrators can configure SGML Author to make their solutions operational quickly. SGML Author Leverages Users' Investment in Word With SGML Author, companies can leverage their investments in Word by adding SGML functionality without having to purchase a separate, dedicated SGML application. Because users of SGML Author are already familiar with Word, they can become productive quickly with minimal incremental training or knowledge of SGML. As organizations increasingly distribute and manage information electronically, they are looking to build publishing solutions based on off-the-shelf applications. "SGML Author for Word solidifies the position of Word as a robust authoring tool that can be used as a component in demanding publishing applications," said Mark Walter, senior editor at Seybold Publications, Inc. SGML Author and Third-Party Tools Deliver Complete SGML Solutions SGML Author is a single component of any SGML solution that usually includes applications from many vendors. Microsoft's strategy is to work actively with third-party vendors to offer companies a complete, end-to-end SGML solution. A number of third-party vendors have created products and services that leverage their areas of expertise to specifically augment SGML Author. "Microsoft's SGML Author provides a significant boost for the SGML industry," said Bill Zoellick, president of Avalanche Development Company. "SGML Author for Word does an excellent job of complementing many existing third-party SGML tools." Microsoft is working with several companies supporting SGML Author, including Avalanche, Electronic Book Technologies, Interleaf, Inc., MicroStar Software and SoftQuad Inc.
System Requirements Microsoft SGML Author for Word requires a PC with an 80386 or higher microprocessor, Microsoft Word 6.0 for Windows or later, the MS-DOS operating system version 3.0 or later, 4 MB of memory minimum (6 MB recommended), 3 MB of hard disk space, and a Microsoft Mouse or compatible pointing device. Also required is one of the following: Microsoft Windows version 3.1 or later, Microsoft Windows(TM) for Workgroups version 3.1 or higher, or Microsoft Windows(TM) for Pen Computing. Pricing and Availability Microsoft SGML Author for Word will be available for approximately $595. It is expected to begin shipping by the end of the fourth quarter of 1994. Founded in 1975, Microsoft (NASDAQ "MSFT") is the worldwide leader in software for personal computers. The company offers a wide range of products and services for business and personal use, each designed with the mission of making it easier and more enjoyable for people to take advantage of the full power of personal computing every day. ######## Microsoft and MS-DOS are registered trademarks and Windows is a trademark of Microsoft Corporation. Editor's Note: MS-DOS is a trademarked product name. Please do not abbreviate in any way. -- Microsoft is not the answer. Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 08 Oct 1994 00:27:25 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ <19941007T232736Z.erik@naggum.no> Subject: Re: MS Word SGML Add on (text version) [Howard Kaikow] | The following is uuencoded, zipped RTF of a fax I received. Could be | errors in OCR or manual conversion, but it should do. [Erik Naggum] | thanks for the contents. I have taken it out of its two proprietary | worlds (PKZIP and RTF) and put it into the universal form of the free | world: text. Ah shucks, now the purty formats are not present.
Newsgroups: comp.text.sgml Date: 08 Oct 1994 00:51:47 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <374qf3$rff@hopper.acm.org> References: <31@direct.iaf.nl> Subject: Re: 5870 - Canadian participation in ISO/IEC JTC1/SC18/WG8 [Simon North] | I'm afraid I have to echo Peter's sentiments and his request. In The | Netherlands the national standards organisation (the NNI) has had | almost all of its government subsidies scrapped and is forced to try | and support itself. | | The company I work for has had the foresight, wisdom and understanding | to support my participation in a standards commission for the past two | years while we try to define some basic acceptance criteria for the | manuals supplied with consumer products. This means that my time has | been paid for and all my travelling expenses. While I empathize, let me point out that in the US no one is government subsidized unless their regular job is under government contract or they work regularly for the government. Companies become (and pay to be) members of the American National Standards Institute (ANSI), which is not a government body, and those companies must fund their staff's participation. The best we might get is to have some of the ANSI fees waived for micro companies like mine. But my attendance at ANSI and ISO essentially comes directly out of my pocket. I either pay it or don't participate. Dave Peterson SGMLWorks! Newsgroups: comp.text.sgml Date: 08 Oct 1994 01:34:11 UT From: "Richard L. Goerwitz" \ Organization: University of Chicago Message-ID: <1994Oct8.013411.8851@midway.uchicago.edu> References: \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? [Claude L. Bullard] | To make it all things to all potential users will be an expensive and | potentially fruitless endeavor. This seems to be the real point of what Mr. Bullard is saying, and I can't contest it. 
I would only add that the Web is not, by nature, a traditional, localized, commercial product. It aspires to much more than this. Part of this vision is that of a globalized information resource. If such a resource were confined to documents written in Latin script, my feeling is that it would not be worthy of the title "world wide." It may be that practical necessity prevents all WWW clients from being able to do all the world's major scripts (e.g., Chinese, Devanagari, Greek, Latin, Hebrew...). But I would hope that servers would be able to provide information in these forms. The main obstacle, after all, is the display end - not the intrinsic document structures and coding schemes. -- -Richard L. Goerwitz goer%midway@uchicago.bitnet goer@midway.uchicago.edu rutgers!oddjob!ellis!goer Newsgroups: comp.text.sgml Date: 08 Oct 1994 01:38:44 UT From: "Christopher R. Maden" \ Organization: Electronic Book Technologies, Inc. Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Having now reviewed the Fact Sheet that describes the MS plans for SGML | Author for Word, it would appear that Word would be able to | create/input an SGML file. However, the formatting will still be | restricted by the oversights in Word, e.g., to use my favorite example, | you still will not be able to use dictionary style definitions and do | automagic numbering of clauses and the table of contents. Hard coding | is just unacceptable. My understanding of the Word add-on is based entirely on what I saw in the article in \. However, Dale Waldt writes: \ Another nice feature is that content can be generated during import to Word styles, or removed when exported to SGML. The following data: Chapter 1. 
Dogs Could be translated to: \Dogs Where the content "Chapter" and the period would be stripped on export to SGML (generated on import from SGML) and the number of the chapter put into an attribute during export. \ So, it appears that Word cannot auto-generate the numbers, but would anyone using SGML Author really use it for output as well as authoring? (Of course.) However, any decent SGML application could easily auto-number chapters or handle dictionary style definitions from the SGML generated by SGML Author. For instance, SGML Author could export to \Dogs instead; the SGML application processing the document could auto-number all \ elements. SGML Author may even have the capability to auto-number; but if so, it's not clear from the \ article. | It is unclear as to what restrictions will be placed on the | specification of a DTD. SGML Author looks pretty flexible about the DTD; I'm not entirely sure if it can import an arbitrary DTD, but it appears so. Quote reproduced without permission from \, October 1994, copyright 1994 SGML Associates, Inc. Standard disclaimer, etc. -Chris -- Christopher R. Maden Electronic Book Technologies, Inc. Applications Consultant One Richmond Square crm@ebt.com Providence, Rhode Island 02906 USA +1 401 421 9550 (voice) +1 401 421 9551 (fax) Newsgroups: comp.text.sgml Date: 08 Oct 1994 01:51:18 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <374tum$rff@hopper.acm.org> References: \ Subject: Re: response to comments on miniSGML [Rick Jelliffe] | And, in SGML tag names have no meaning too. They are not given meaning | by the DTD, merely structural interpretation. Depends on the completeness of your DTD. "Meaning", AKA "semantic role" (i.e., "what semantic role is this type of element intended for?"), is not something an SGML parser cares about--any more than a RDBMS cares about the semantic role of the data in its tables.
Therefore, there is no point in SGML prescribing a formal way of specifying semantic roles--generic SGML software can't make any use of the information. ISO 8879 does, however, permit/recommend that you include informal information in the DTD (not necessarily in the machine-readable part), and in particular it does suggest including semantic roles for your element types and attributes. A Document Type Definition should be more than just the declarations you want included in the document type declaration. | Warren also likes Empty elements. But an empty element is the | equivalent of a normal element which doesn't happen to have any | content in the instance. Since I make the brave assumption that data | is valid (I suppose this means not generated by a human, in practise) | there is no problem. (In fact, I think the Standard says that Empty | elements pass both a start and end tag to the application in the ESIS | (in the annex or appendix on ESIS?)). If by "the Standard" you mean ISO 8879, not so. ESIS is not mentioned. ESIS is mentioned in the new SGML conformance standard, but that standard is specifying what should be output from a parser (or more accurately, from a testing application coupled to a parser) in what is really a test report. Personally, I think of EMPTY elements as markers in the text telling me that something machine generated or non-SGML goes here, or perhaps (using attributes) telling me something about the data that doesn't apply to one particular element (like column information placed inside a row-oriented table). ISO 8879 certainly encourages this "marker in the text" point of view. While ordinary elements have their content bracketed by a pair of tags, even if that content happens to be empty, EMPTY elements do not have a pair of tags, just one tag that says "here goes something special". Dave Peterson SGMLWorks!
davep@acm.org Newsgroups: comp.text.sgml Date: 08 Oct 1994 02:57:08 UT From: Gary Houston \ Organization: Actrix Information Exchange Message-ID: \ References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.153534.22105@exoterica.com> Subject: Re: Multilingual HTML, SGML documents? [Gary Houston] | I have never seen a programming language which required a character set | declaration. The same technique can be applied to SGML, as Tim Bray | has already pointed out. [Norbert Winklareth] | SGML is not a programming language, it has no semantics. This is its | major strength and weakness. Unfortunately most people both | over/under-estimate what SGML is capable of doing. I don't follow you: what do semantics have to do with character sets? | SGML is about input representations, that is why character sets are an | issue. Character sets are an issue for all text representations, so it seems reasonable to look for more general solutions which can be applied to SGML. Newsgroups: comp.text.sgml Date: 08 Oct 1994 03:52:47 UT From: Chul-Woong Yang \ Organization: Korea Research Environment Open Network (KREONet) Message-ID: <37552f$epc@news.kreonet.re.kr> Subject: Hypermedia system & DB query (Q) Hi. I want to know about existing hypermedia systems. My prototype hypermedia information system will support database queries. I am considering two possible approaches: (1) build a database server for database queries; the hypermedia system will forward each query to the server and get the result back; (2) embed the query (e.g., dynamic SQL) in the hypermedia system itself. I want to know how existing systems cope with this situation, in detail, and also whether a better way exists, and how to do it. Thanks in advance.
-- Yang, Chul-woong | Mail:cwyang@dbserver.kaist.ac.kr o ,__o MS2 / Research Staff | Tel :+82-42-869-5998 (Room) \\-\\_<, Database Laboratory | +82-42-869-3653 (Office) (*)/'(*) Computer Science Dept.| Fax :+82-42-869-3510 "~~'~~"~"~''~""~~'~~"~~"'~"~"~~' KAIST | HTTP://dbserver.kaist.ac.kr/WWWHOME/dbman.html#cwyang Newsgroups: comp.text.sgml Date: 08 Oct 1994 04:28:34 UT From: Yuri Rubinsky \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Oct8.042834.5513@sq.sq.com> Keywords: sgml conference news Summary: Readers are invited to submit short newsy items Subject: CALL FOR SGML NEWS ITEMS for SGML '94 C A L L F O R B I T S _________________________ By now a five-year tradition, the SGML Year in Review continues to open both the North American and European SGML series of conferences. As always, it attempts to represent exciting new projects and initiatives as well as provide updates on on-going work of interest to the conference attendees and to the readers of this newsgroup and the half-dozen newsletters that now reprint the presentation. SGML '94 is fast approaching (Nov 6 tutorials; Nov 7-10 conference; Nov 10-11 SGML Open events), and with it, the Year in Review Call for Contributions. THIS IS YOUR INVITATION to sing your own praises, toot your own horn, give the world a paragraph or two (maximum!) to describe: 1. Relevant Standards Activity 2. Governmental (or EC/CE or other international) Initiatives 3. Major Industry Initiatives 4. New (since Dec '93) and Forthcoming Publications 5. Major or Especially Interesting SGML Implementations 7. Any SGML Activity that fits in the "Miscellaneous" Category Of particular interest are updates on any activities reported in previous Year in Review collections. EVEN IF YOU ARE PLANNING TO SPEAK ABOUT THE RELEVANT ITEM AT SGML '94 IN A TALK OR POSTER SESSION, PLEASE ALSO SEND A FEW SENTENCES FOR The Year in Review. As always, thanks in advance. Please ensure we have the relevant bits by Nov 1, 1994. 
Email submissions to: yuri@sq.com Please put either "SGML '94" or "Year in Review" in the subject line. -- Yuri Rubinsky +1 416 239 4801 Chairman, SGML '94 +1 800 387 2777 (from US only) President, SoftQuad Inc. uucp: {uunet,utzoo}!sq!yuri Suite 810 56 Aberfoyle Crescent Internet: yuri@sq.com Toronto, Ontario, Canada M8X 2W4 Fax: +1 416 239 7105 Newsgroups: comp.text.sgml Date: 08 Oct 1994 04:36:34 UT From: Chiradeep Vittal \ Organization: Computing Science, U of Alberta, Edmonton, Canada Message-ID: <3757ki$4e5@scapa.cs.ualberta.ca> References: <374l6q$fuv@usenet.rpi.edu> Subject: Re: Storage Models for SGML data [Todd G. Hivnor] | I'm wondering what's going on with storage models for hypertext | SGML data. I'm looking for a way to store large quantities of | slowly evolving SGML files. Specifically, the data in question | is government regulations, but the problem is probably more general. | | Things I'd like to see in the storage system: | | 1) Revision control (when, how, and by whom was the data changed? | And, put it back the way it was) | 2) Support for hierarchical documents (paragraph D of subchapter 3 of | title blah blah blah) | 3) An ability to maintain bi-directional links. (Suppose the | legislature inserts a paragraph, shifting all of the remaining | paragraphs down. How do I keep the pointers into the relocated | paragraphs valid?) | | I could use a relational database, but I'm wondering if there | isn't something out there which was designed with these problems | in mind. Documents are complex entities (hierarchical, as you point out). An RDBMS is hardly the ideal candidate to model this complexity. An object oriented DBMS would be the ideal candidate because it can do all the 3 things you have listed above. I'm currently implementing an SGML/HyTime database on a commercial OO DBMS. A tech report should be available within a couple of weeks. 
Our target application is multimedia news, and we assume zero updates to the document, so our storage model may not be entirely appropriate for your needs. Cheers -- Chiradeep Chiradeep Vittal | Office : #662 General Services. Graduate Student | Home : 3B/9008, 112 St, HUB Mall Dep. of Computing Science | Edmonton, Canada T6G 2C5 University of Alberta | Ph : +1 403 492 7591(O),+1 403 439 4529(R) Edmonton, Canada T6G 2H1 | http://web.cs.ualberta.ca/~vittal Newsgroups: comp.text.sgml Date: 08 Oct 1994 04:41:59 UT From: "Kenneth R. Kress" \ Organization: That's Entertainment BBS Message-ID: <1994Oct8.044159.19125@phent.UUCP> References: \ <36s4r0$p0c@yucca.ossi.com> Subject: Re: What DTD to use for Man-Pages? [Ralph Ferris] | Manual page markup is thoroughly covered in the DocBook DTD, which has | become the de facto standard for software technical documentation. | DocBook was originally developed by HaL Computer Systems and O'Reilly & | Associates (who hold the copyright); anyone can use it freely provided | the copyright notice is included. DocBook's development is under the | control of the Davenport group. I suggest getting on their alias to | keep up with developments. You can get information on how to do so by | logging into ftp.ora.com via anonymous ftp and following the | instructions in ./pub/davenport/toget.docbook I retrieved the shar package and the postscript documentation (which apparently was processed with dvips and TeX), but was disappointed there was no mention of taking docbook through TeX to some form of output. Is there a docbook to (La)TeX ASP file as there is with linuxdoc? Am I missing something? Ken. -- :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: Kenneth R. Kress dsinc!phent!kkress Newsgroups: comp.text.sgml Date: 08 Oct 1994 12:20:35 UT From: Sean Mc Grath \ Organization: Digitome Ltd. Message-ID: <3762qj$4p6@finnegan.iol.ie> Subject: Re: YASP, does anyone have a MS-DOS copy? 
[Charles Hurley] | Does anyone have a MS-DOS copy of YASP? I'd like to piggy back on this one if I may. I have looked through the compilation scripts, etc., for YASP but I do not have access to an OS/2 box to compile up a DOS version as required. I guess with a bit of work the REXX script could be analysed to produce a DOS makefile for Microsoft or Borland and the code compiled directly on a DOS based PC. If this has already been done, I would appreciate any information. If it has not been done, we will do it and post details here. Either which way, a DOS binary of YASP compiled using the existing OS/2 hosted compilation process would be very useful. Any takers? Regards, Sean Mc Grath \ Newsgroups: comp.text.sgml Date: 08 Oct 1994 12:23:16 UT From: Tim Bray \ Organization: MIND LINK! Communications Corp., Langley, BC, Canada Message-ID: <3762vk$irc@deep.rsoft.bc.ca> References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> Subject: Re: SGML "Viewers" [Kevin Brennan] | We've been hearing about several SGML "viewers". I'd be interested in | hearing pros/cons on a few. At SGML '94, I'll be giving a talk which, while it may not have detailed pros/cons on any particular viewer (I'd never *heard* of Jouve's) will at least try to present a taxonomy and framework for thinking about them. | We're particularly interested in Air Transport Association (ATA) SGML | compliant viewers with Temporary Revision capability. What's Temporary Revision capability in a viewer? Inquiring minds want to know. Cheers, Tim Bray, Open Text Corporation Newsgroups: comp.text.sgml Date: 08 Oct 1994 12:35:06 UT From: Tim Bray \ Organization: MIND LINK! Communications Corp., Langley, BC, Canada Message-ID: <3763lq$j2b@deep.rsoft.bc.ca> References: \ \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? [Claude L. Bullard] ...
in a long intriguing article about design goals and the big picture in the worlds of SGML and HTML. I agree with a lot of it. But right off the top, he sez: | The scope of the application is always defined for a commercial | application before one builds it. This prevents ... Wrong. Wrong. Wrong. Wrong. The scope of a commercial application is *never* defined in advance. If its designers have any brains, they will assume that new and unforeseen demands will be placed upon it, and new requirements will arise for the use of its data. Thus they will use techniques such as data-independence and SGML and data dictionaries and so on to maximize the chance of this happening. This is one of the reasons why getting language portability right is central. And it should be done without requiring a lot of complex machinery. The scope of *military* applications is often attempted to be specified in advance. That's why they so often fail and always cost more than is reasonable, because the scope is guaranteed to be wrong. Cheers, Tim Bray, Open Text Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:00:16 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> <3762vk$irc@deep.rsoft.bc.ca> Subject: Re: SGML "Viewers" [Tim Bray] | What's Temporary Revision capability in a viewer? Inquiring minds | want to know. The ability of a TV audience watching a politician speaking to adjust their political views to match those being presented :-) ///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:12:19 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: \ \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> <3763lq$j2b@deep.rsoft.bc.ca> Subject: Re: Multilingual HTML, SGML documents? [Claude L. Bullard] | The scope of the application is always defined for a commercial | application before one builds it. This prevents ...
[Tim Bray] | Wrong. Wrong. Wrong. Wrong. The scope of a commercial application | is *never* defined in advance. If its designers have any brains, they | will assume that new and unforeseen demands will be placed upon it, and | new requirements will arise for the use of its data. Thus they will | use techniques such as data-independence and SGML and data dictionaries | and so on to maximize the chance of this happening. Nearly. The designers usually have the brains, but it's the level at which the design is approved which causes the hiccups. Try as they may, designers who specify (eg SGML) for the reasons you give will often hit the brick wall of management-who-know-better-than-the-expert, who specify that it "must use MS-Word" or "must use corporate standards" (being CrudWrite DeLuxe or something similar) -- often for perfectly understandable but _short-term_ reasons. Which is why I actually turn down business on occasions. If the brick wall is too thick, I'm not going to allow myself or my colleagues or programmers to be thrown to the lions for following blind alleys. I leave that to others \ | This is one of the reasons why getting language portability right is | central. And it should be done without requiring a lot of complex | machinery. Ping! Bray 1: World 0. | The scope of *military* applications is often attempted to be specified | in advance. That's why they so often fail and always cost more than is | reasonable, because the scope is guaranteed to be wrong. Hehehehe. Who was it authorised the (doubtless apocryphal) mass program of stable-building in the last century "because the number of horses required by the US Army will increase X-fold in the next 100 years" or some such?
///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:16:32 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: \ <19941007T232736Z.erik@naggum.no> \ Subject: Re: MS Word SGML Add on (text version) [Howard Kaikow] | The following is uuencoded, zipped RTF of a fax I received. Could be | errors in OCR or manual conversion, but it should do. [Erik Naggum] | thanks for the contents. I have taken it out of its two proprietary | worlds (PKZIP and RTF) and put it into the universal form of the free | world: text. [Howard Kaikow] | Ah shucks, now the purty formats are not present. Ah, you mean Content-type: text/free i.e., content-free as well :-) ///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:23:45 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: <31@direct.iaf.nl> <374qf3$rff@hopper.acm.org> Subject: Re: 5870 - Canadian participation in ISO/IEC JTC1/SC18/WG8 [Simon North] | I'm afraid I have to echo Peter's sentiments and his request. In The | Netherlands the national standards organisation (the NNI) has had | almost all of its government subsidies scrapped and is forced to try | and support itself. | | The company I work for has had the foresight, wisdom and understanding | to support my participation in a standards commission for the past two | years while we try to define some basic acceptance criteria for the | manuals supplied with consumer products. This means that my time has | been paid for and all my travelling expenses. [Dave Peterson] | While I empathize, let me point out that in the US no one is government | subsidized unless their regular job is under government contract or | they work regularly for the government. Companies become (and pay to | be) members of the American National Standards Institute (ANSI), which | is not a government body, and those companies must fund their staff's | participation. 
| | The best we might get is to have some of the ANSI fees waived for micro | companies like mine. But my attendance at ANSI and ISO essentially | comes directly out of my pocket. I either pay it or don't participate. This is perfectly acceptable in a large, rich and powerful country where individuals are able to retain the greater portion of their earnings. In small countries run by spavined governments, where individual net earnings cannot cover the costs, we have to try and persuade the rulers that some of their take should be returned in the form of subsidy for these activities, rather than be spent on redecorating ministerial offices. What we ought to be doing, of course, is replacing the government...:-) But there ain't no votes in SGML...yet. ///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:42:50 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> Subject: Re: Storage Models for SGML data [Todd G. Hivnor] | I'm wondering what's going on with storage models for hypertext | SGML data. I'm looking for a way to store large quantities of | slowly evolving SGML files. Specifically, the data in question is | government regulations, but the problem is probably more general. | | Things I'd like to see in the storage system: | | 1) Revision control (when, how, and by whom was the data changed? | And, put it back the way it was) Bibliographical support is essential in any DTD designed for large-scale or long-term storage. The TEI's header is perhaps the best recent example. | 2) Support for hierarchical documents (paragraph D of subchapter 3 of | title blah blah blah) This is crucial and remains unaddressed by any SGML software I have seen. Before you all jump on me and tell me (a) you can say \
or similar; or (b) that editors etc can activate on-screen section numbering deduced from \
or similar elements, what I mean is support for Canonical Reference. If I am in the midst of an instance, having got there via, say, a search routine which has returned the exact occurrence of "foo" that I was looking for, I now want to know where I am: I want the database system to return not just "`foo' found at offset 123456" but something like "`foo' found in \ #14 in \ #5 in \ #8 in \ #3", according to some rules I specified before the search saying that for all hits I want the structure returned using such-and-such a hierarchy. The TEI proposals (P3) provide an interesting mechanism for doing this in reverse: you can specify \
-something and \

-something else and the hooks are there for software authors to use to allow this kind of navigation. But from my reading of it, it is directed at retrieval where you already know the Canonical Reference in advance, and just want to look it up (eg `find me the text of 1 Kings 21:21') rather than the search-oriented lookup, where you find the word or phrase and want to derive the pointers. If I've misunderstood the TEI's take on this, doubtless Lou or Mike will put me right :-) but I still maintain that the need is there for search engines to be able to return Canonical References. Without it, support for hierarchical references is a one-way street, as you have rightly spotted. | 3) An ability to maintain bi-directional links. (Suppose the | legislature inserts a paragraph, shifting all of the remaining | paragraphs down. How do I keep the pointers into the relocated | paragraphs valid?) This is much easier. Use referential mnemonics, not hard-wired locations. [La]TeX has been doing this for over a decade, HTML for a few years. Now that some of the mechanisms for using hypertextish linking between and within files are becoming better-established, it should be relatively easy to decide on a system to allow database maintainers to insert these links. Keeping the referential integrity is another problem: it's easy to point from parent to increasing numbers of children, but much harder to check back when one of the parents dies. A relational database, as you suggest, would handle the problem, but most of these don't scale, or do so only at much increased cost, both in terms of software and of the hardware needed to support it. My, it's a lovely sunny morning outside...:-) ///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:49:43 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: \ <36s4r0$p0c@yucca.ossi.com> <1994Oct8.044159.19125@phent.UUCP> Subject: Re: What DTD to use for Man-Pages? 
[Ralph Ferris] | Manual page markup is thoroughly covered in the DocBook DTD, which has | become the de facto standard for software technical documentation. | DocBook was originally developed by HaL Computer Systems and O'Reilly & | Associates (who hold the copyright); anyone can use it freely provided | the copyright notice is included. DocBook's development is under the | control of the Davenport group. I suggest getting on their alias to | keep up with developments. You can get information on how to do so by | logging into ftp.ora.com via anonymous ftp and following the | instructions in ./pub/davenport/toget.docbook [Kenneth R. Kress] | I retrieved the shar package and the postscript documentation (which | apparently was processed with dvips and TeX), but was disappointed | there was no mention of taking docbook through TeX to some form of | output. Is there a docbook to (La)TeX ASP file as there is with | linuxdoc? Am I missing something? I'm using Silmaril's SGML2TeX which happily lets you convert arbitrary (but non-minimised, pre-validated) SGML instances (including DocBook) into TeX files, and creates a .sty file which you then fill out with whatever hooks into plain TeX or LaTeX you need (or you can pre-code them in a config file). You can still get the \β version (which doesn't handle DOCTYPE) from ftp://www.ucc.ie/pub/sgml/sgml2tex.zip (PC version only as yet). \γ release in a few months. ///Peter Disclaimer: I wrote it. Newsgroups: comp.text.sgml Date: 08 Oct 1994 18:08:03 UT From: Jeffrey McArthur \ Organization: ATLIS Publishing Message-ID: <378h9d$e12@news.delphi.com> References: <374l6q$fuv@usenet.rpi.edu> Subject: Re: Storage Models for SGML data [Todd G. Hivnor] | Things I'd like to see in the storage system: | | 1) Revision control (when, how, and by whom was the data changed? And, | put it back the way it was) I have done this. This is relatively easy to do. We maintain a simple relational database of each document. 
The database keeps track of when, who, and what the person changed. We also track how long it took them to make the changes. The biggest difficulty with this approach is it makes it very difficult to use "off-the-shelf" tools like Author/Editor. A/E is a wonderful tool if you use their pre-compiled data format. But if you have to import and export data into A/E every time you use it, it "sucks dead bunnies through a straw". This is particularly nasty if you want to edit "medium sized" documents in the range of 10-20 Meg in size. Editing large documents (40 Meg or larger) is almost impossible. If your documents fall in the small category (less than 1 meg) you can probably put up with the hassles of importing and exporting with A/E. Version control has saved us a lot of time and money. Unfortunately mixing version control with most GUI based SGML tools is way too hard. That is why we end up using things like PSGML instead. This works. -- Jeffrey M\kern-.05em\raise.5ex\hbox{\b c}\kern-.05emArthur a.k.a. Jeffrey McArthur email: j_mcarthur@bix.com phone: +1 301 210 6655 fax: +1 301 210 4999 home: +1 410 290 6935 The opinions expressed are mine. They do not reflect the opinions of my employer. My access to the Internet is not paid for by my employer. Newsgroups: comp.text.sgml Date: 08 Oct 1994 18:15:24 UT From: Namhoon Yoo \ Organization: University of Southern California, Los Angeles, CA Message-ID: <376njs$g9u@pollux.usc.edu> Subject: Any SGML user group in Southern California? Hello, I need the contact point of the subject group. Any pointer would be appreciated. Thanks in advance.
-- Regards - NamHoon YoO [nyoo@usc.edu] Newsgroups: comp.text.sgml Date: 08 Oct 1994 21:33:23 UT From: Steven Lindberg \ Organization: The World Public Access UNIX, Brookline, MA Message-ID: \ Keywords: Job posting Summary: Position at HarperCollins Publishers NYC Subject: Job posting HarperCollins College Publishers is looking for someone to assist with SGML and Quark production support on a consultant or freelance basis. The main task is to assist in-house staff with parsing and validation of SGML files and with the creation of search-and-replace routines that add XPress tags to the parsed files. Depending upon the individual's proclivities, the work might also involve some desktop production. The ideal candidate would have the following qualifications: - previous support or training experience - working knowledge of SGML - hands-on production experience with QuarkXPress and familiarity with XPress tags - in-depth knowledge of traditional typesetting - experience with programming or macro creation - proficiency on both Mac and PC Interested parties should contact Tim Armstrong, ETM Manager, HarperCollins College Publishers, at +1 212 207 7289 or fax a resume to him at +1 212 207 7703. Newsgroups: comp.text.sgml Date: 08 Oct 1994 23:13:55 UT From: Gary Houston \ Organization: Actrix Information Exchange Message-ID: \ References: \ <19941005T224732Z.erik@naggum.no> Subject: Re: multilingual HTML, SGML documents [Erik Naggum] | the connection between the SGML declaration and the characters actually | found in the entities that are read is already significantly weakened | by the fact that the entity manager must convert line endings into | record boundaries. all computer languages that wish to be used outside | of a single (type of) system must cater to the diversity of line | endings, and this SGML does well in principle, but we still have a | vestige of "codes" for the RS and RE "characters" when they really are | events like the Entity end "signal". 
this is very unfortunate, and | leads to a number of other minor problems. Yes, really they only need to say something like "an entity may be divided into lines according to the conventions of the computer system" and then talk about the semantics of empty lines etc., and use a different notation for short references (assuming these can not just be replaced with editing macros...). | the main problem is this: if the entity manager is free to translate a | small set of codes to other codes, and the SGML parser is only | concerned with what it receives from the entity manager, then the | entity manager might as well translate everything into a uniform code | that the SGML parser would be happy with. the SGML parser is only | concerned with the codes that means something for the parsing process, | anyway, and the application gets whatever the SGML parser considers to | be "not markup". it should be obvious that this scheme leaves a lot to | be desired in terms of identifying the codings involved. we note that | ESIS does not contain any provisions for identification of character | sets. again, ESIS is _not_ a good idea outside of conformance testing, | for which it was defined. | | to solve this problem, it is obvious that we have to sever the link | between the SGML declaration and the actual characters in the files (or | whatever). that is, each file (etc) read must identify its character | sets somehow, and the entity manager must then be responsible for | translation of this source character set into whatever the SGML parser | will be happy with. at this stage, it is inconsequential how the | coding at the file level looks, as long as the SGML parser is happy. | the result is, however, that the SGML declaration's character set stuff | is inconsequential, as well. In my ideal case I would use the same character set for every text file. 
There could at least be a default (as in the locale of the computer system) for a file for which a character set is not explicitly identified. It also seems more elegant if the identification of the character set can be done outside the file itself (imagine an operating system which provided a character set attribute for each file, or a database of some kind.) However I would believe you if you said this was not practical at present. Also when exchanging files between systems it may be desirable to use character set mechanisms provided by the transport layer and remove any character set declarations from files. [Sorry if my postings seem out of order: my news feed has random delays of 1 to 4 days or more for incoming articles.] Newsgroups: comp.periphs.printers,comp.text,comp.text.desktop,comp.text.sgml Date: 09 Oct 1994 16:42:37 UT From: Lars Bruzelius \ Organization: Upsala University Computer Center, SE Message-ID: <3796gf$12f4@caesar.udac.se> Subject: Correction: 15th Annual Xplor Electronic Document Conference & Exhibit Oops, I goofed. Dyslexica strikes again. The domain was missing in the site part of the URL. As a compensation to those who tried I can offer you a couple of pages on Xplor International; Non-impact Printers; and Maritime History if you follow the links from my home page at: http://pc-78-120.udac.se:8001/ The 15th Annual Xplor Electronic Document Conference and Exhibit will take place November 6-11, 1994, at the Phoenix Civic Plaza, Phoenix, AZ. For more information take a look at the WWW-server at: http://pc-78-120.udac.se:8001/www/Xplor/15th_Annual_Conference/Conference.html Most of the 64 pp conference announcement has been made available in hypertext format. Please note that this is not an official Xplor WWW-server. For additional information please contact: Xplor International, 24238 Hawthorne Blvd., Torrance, CA 90505-650, USA.
Telephone: 1 800 669 7567; +1 310 373 3633; +31 2979 41 136 (European Office) Telefax: +1 310 375 4340 Internet: info@xplor.com Please say that you found it on the Web! -- Lars Bruzelius Uppsala University Computer Center Box 174, S-751 04 Uppsala, Sweden. Telephone: +46 (0)18 187731 Bitnet: uddri@seudac21 Telefax: +46 (0)18 516600 Internet: Lars_Bruzelius@udac.uu.se Newsgroups: comp.text.sgml Date: 10 Oct 1994 03:45:50 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data [Todd G. Hivnor] | I'm wondering what's going on with storage models for hypertext SGML | data. I'm looking for a way to store large quantities of slowly | evolving SGML files. Specifically, the data in question is government | regulations, but the problem is probably more general. | | 3) An ability to maintain bi-directional links. (Suppose the | legislature inserts a paragraph, shifting all of the remaining | paragraphs down. How do I keep the pointers into the relocated | paragraphs valid?) [Peter Flynn] | This is much easier. Use referential mnemonics, not hard-wired | locations. [La]TeX has been doing this for over a decade, HTML for a | few years. I'm not sure what Peter means by "referential mnemonics". Do you mean ID references as opposed to something like structural position or byte offset within a given file? If so, then this has been part of SGML from the beginning. If, instead, you mean locations based on properties of objects other than their unique names, then HyTime's location address methods do this, specifically the ability to use queries that refer to the values of properties of objects (e.g., I want to find the CommandRef element whose CommandName sub-element exactly contains the value "fubar" and whose revision attribute value is greater than "1.1"). 
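To make the property-based addressing just described concrete, here is a minimal sketch in Python rather than in HyTime query syntax. It assumes the instance has been normalized to well-formed XML so that the standard library's ElementTree can read it. The names CommandRef, CommandName and revision come from the example above; the sample data, the function names, and the dotted-number comparison of revision values are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample instance; only the names CommandRef, CommandName
# and revision are taken from the example in the post.
SAMPLE = """
<Manual>
  <CommandRef id="c1" revision="1.0"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef id="c2" revision="1.2"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef id="c3" revision="2.0"><CommandName>frob</CommandName></CommandRef>
</Manual>
"""


def revision_key(text):
    """Turn '1.10' into (1, 10) so revisions compare numerically (an assumption)."""
    return tuple(int(part) for part in text.split("."))


def find_command_refs(root, name, newer_than):
    """Return CommandRef elements selected by property values, not by position."""
    floor = revision_key(newer_than)
    hits = []
    for ref in root.iter("CommandRef"):
        command_name = ref.find("CommandName")
        # "Exactly contains": the sub-element's text must equal the name.
        if command_name is None or (command_name.text or "").strip() != name:
            continue
        if revision_key(ref.get("revision", "0")) > floor:
            hits.append(ref)
    return hits


root = ET.fromstring(SAMPLE)
matches = find_command_refs(root, "fubar", "1.1")
print([ref.get("id") for ref in matches])  # prints ['c2']
```

Because the anchor is located by its properties rather than by a byte offset or a structural position, inserting or reordering other elements in the document does not invalidate the reference.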
It is, of course, up to a given document management or retrieval system to provide support for the query-based addressing for which HyTime defines the syntax, but many systems already have this facility and it's merely a matter of mapping the HyTime syntax onto the functionality of a given system. And tools such as TechnoTeacher's MarkMinder/HyMinder product provide the HyTime-specific functions you'd need to do such an integration. However, you can't fix the "link maintenance" problem just by telling people to change their addressing method, because they often either don't have a choice or have compelling reasons for using specific types of addresses. This means that you need a more complete link management system that can track and manage links regardless of the addressing used or the ultimate objects located as anchors. This is not trivial, but people have been and are working on the problem. The difference is that before there was no interchange form that would allow the interchange of link information and linked documents among different management systems, but with HyTime there is. This means that document and link management systems need no longer be solely monolithic, but can potentially communicate at the data level (as opposed to the API or processing level) with other systems. Complementary specifications such as the Shamrock document management API promise some hope of program-level interchange as well. Ideally, I should be able to create interlinked HyTime documents, bring them into document/link manager X, modify them over time, export them (as HyTime documents, of course), import them into document/link manager Y and be able to continue the exact same work with the same level of control and understanding of the links in Y as I had in X without ever changing the source form of the documents or the links. The import and export should be very nearly transparent with no apparent change to the documents in Y from X. 
Systems X and Y may have very different ways of optimizing their internal representations of the links and addresses used with them, but to the world outside of X and Y, the data in the systems appears to be identical (where identity in SGML can only mean "have the same character stream"). -- \

W. Eliot Kimber (kimber@passage.com) Systems Analyst and HyTime Consultant Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400 465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911 \
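The property-based addressing described in the post above (locating the CommandRef element whose CommandName is exactly "fubar" and whose revision is greater than "1.1") can be illustrated with a small sketch. This is not HyTime or HyQ syntax; it is plain Python over a well-formed XML stand-in for an SGML instance. The sample document, the `id` attributes, the helper names, and the reading of "greater than" as a dotted-version comparison are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample instance; element names follow the example in the post,
# the id attributes are invented so the matches can be reported.
SAMPLE = """
<Manual>
  <CommandRef id="c1" revision="1.0"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef id="c2" revision="1.2"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef id="c3" revision="1.3"><CommandName>other</CommandName></CommandRef>
</Manual>
"""


def version_key(text):
    """Turn '1.10' into (1, 10) so revisions compare numerically (assumption)."""
    return tuple(int(part) for part in text.split("."))


def find_command_refs(root, name, min_revision):
    """Return CommandRef elements selected by their properties, not their position."""
    matches = []
    for elem in root.iter("CommandRef"):
        command_name = elem.findtext("CommandName", default="")
        revision = elem.get("revision", "0")
        if command_name == name and version_key(revision) > version_key(min_revision):
            matches.append(elem)
    return matches


root = ET.fromstring(SAMPLE)
hits = find_command_refs(root, "fubar", "1.1")
print([elem.get("id") for elem in hits])
```

Because the address is a query over properties, inserting or reordering CommandRef elements does not invalidate it, which is the contrast with structural position or byte offset drawn in the post.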
Newsgroups: comp.text.sgml Date: 10 Oct 1994 12:52:34 UT From: Fred Fenimore \ Organization: The News & Observer Message-ID: \ References: <372m54$37q@hermes.achilles.net> Subject: Re: word processor to SGML conversion [Peter Bomberg] | I would like to ask if there is a good listing of products in this | area? If so where and if not could anyone that has any info please | email me. I am specifically interested in WP 5.2 and WP 6.0, Frame and | Word to SGML converters/translators. There is a pretty good jump page on the Web at http://info.cern.ch/hypertext/WWW/Tools/Filters.html Of course, all of this info is about getting from the various standard word processors to HTML, so it might not be what you are looking for. Hope this helps, Fred -- --------------------------------------------------- | Fred Fenimore | The News & Observer | | | 215 S. McDowell Street | | fredf@nando.net | Raleigh, NC 27602 | | | (919) 829-4769 | --------------------------------------------------- | I speak only for myself and then only rarely | --------------------------------------------------- Newsgroups: comp.text.sgml Date: 10 Oct 1994 15:17:18 UT From: "Claude L. Bullard" \ Message-ID: <9410101517.AA44549@source.asset.com> References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? I really enjoyed your comments. They have perspective and a healthy sense of humor. I apologize for the length of *this_one*, but I would rather reply to all of the comments in one post. |-) [Tim Bray] | Wrong. Wrong. Wrong. Wrong. The scope of a commercial application | is *never* defined in advance. If its designers have any brains, they | will assume that new and unforeseen demands will be placed upon it, and | new requirements will arise for the use of its data... The scope of | *military* applications is often attempted to be specified in advance. 
| That's why they so often fail and always cost more than is reasonable, | because the scope is guaranteed to be wrong. I may have overstated that, Tim. However, it is my experience in large design groups that the failure to *scope* the design is the single largest contributor to underscheduled, underbudgeted and underdelivered applications. IMO, this is a fundamental of applying view dimensions to the analysis of complex systems. On the famous other hand, I think that this also has a lot to do with the overall maturity of the design group itself. However, a well-written and thought-through design spec is a requirement for many parts of a project, especially those organized around Work Breakdown Structures, etc. It's hard to bill against good intentions. Oddly enough, the design spec document system was created to keep costs down. Talk about *good intentions*. |-) Military applications are, in the perceptions of commercial design, overbuilt. But, this is a result of the extraordinary environments in which these applications are intended to function. Only the military requires that a PC operate after being submerged for four hours in salt water or that an LED screen be perfectly readable in the midday sun of a desert. Does this sort of thing get overdone? Yes, but it was this sort of overdoing that enabled the Patriot to perform as an ABM when it was only designed to be a SAM. No one foresaw that requirement as far as I know. But, as you point out, "if its designers have any brains, they will assume that new and unforeseen demands will be placed upon it". They do it with a broad specification negotiated with the customer who must figure both cost and flexibility into the design. A lot of good ideas go to the O-file because of cost. [Peter Flynn] | Who was it authorised the (doubtless apocryphal) mass program of | stable-building in the last century "because the number of horses | required by the US Army will increase X-fold in the next 100 years" or | some such? 
Probably the same ilk who approved that long underground tunnel in Texas for creating a *supercollider* to smash the final subatomic particle before the rest of the world could get ahead in the race to see who will be first to make the most of nothing. It always depends on who is *smashing* what against the brick wall. Often if designers are left to their own devices, nothing from something is what one gets for their dollars. BTW, I am a designer, not management. I didn't want to gray prematurely. |-) [Richard L. Goerwitz] | I would only add that the Web is not, by nature, a traditional, | localized, commercial product. It aspires to much more than | this....Part of this vision is that of a globalized information | resource. If such a resource were confined to documents written in | Latin script, my feeling is that it would not be worthy of the title | "world wide". I agree on both points. But it is the aspiration that I question. Aspirations are human activities, not the actions of artificial systems; so, are these human aspirations realistic? Should the WWW builders be required to create character set support for the language of the tribes of the African interior with the "click" on the vocables? If the tribes become WWW users, one might say "certainly". But if only a few thousand users can read or understand that set, is it worth the effort? I don't have an answer for that. What I do agree with wholeheartedly is that if the WWW is to realize even a small percentage of those aspirations, cheap and easy to implement mechanisms for the character sets must be found and agreed upon. Excluding cultural groups deprives us of too much and forcing them into the Western languages and the consequent mind set impoverishes all. Yet after mulling these issues over the weekend, these are what I think will become requirements for the WWW: 1. 
Mechanisms (legal or automated) for registering different content types (in the SGML buzz, applications a la DTDs) should be a high priority for the IETF. SGML was designed with that in mind and I should like to be able to continue the practice. HTML is fine for most of what is intended for the Web, but if *unanticipated* applications are to be practical and cost-effective, and the global environment is to foster competition as well as evolution, then we must be allowed to use many of the same mechanisms available on the WWW with different data models and processing strategies. If we can all agree to use an efficient subset of the HyTime addressing and location models, these applications can interoperate to that degree which their designers deem necessary. 2. This will be controversial. It may be necessary for the WWW to prune its nodes just as the human brain reassigns segments of the neuronal *wiring* when they are unaccessed for some amount of time. Pruning may be a local community activity where the *local community* is defined by the laws of the country in which the node resides. 3. If one does not already exist, there should be an international mechanism subject to the *local community* whereby one is *licensed* to become a WebSite. This should not be overly restrictive and can be comparable to existing regulations for becoming a licensed ham (shortwave) radio operator, but it can help to prevent abuse of the network and its users. I don't believe such requirements would interfere with the goals of a global information resource. They would reinforce the primacy of international regulation of a global resource for the betterment of all who service and are served by it. IMHO, it has been the ability of the international community to do just this that has enabled us to awaken from the nightmare that I cited in my last post. Perhaps then, we can hold on to both the dream and the bright morning. Thank you for your patience. 
Len Bullard Newsgroups: comp.text.sgml Date: 10 Oct 1994 15:28:10 UT From: Richard Wal Matzen \ Organization: Oklahoma State University, Computer Science Message-ID: \ Subject: Re: Anyone have any info about WORD's recent SGML release? Regarding Microsoft(R) SGML Author for Word: [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. It covers everything you can do in Word, including | math and tables. This seems to imply that the formatting info associated with the styles is output to the SGML file. However, in all of the Microsoft press releases I havent seen any references to style info being output. So my questions are: Does anyone know if/how this product handles the exchange of style (formatting) info? Is any style info output? If so, how complete is it and in what form is it: attributes? similar to Rainbow DTD? Also, same question for WordPerfect's SGML product. -- Rick Matzen \ Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 10 Oct 1994 17:25:43 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ Subject: Re: Anyone have any info about WORD's recent SGML release? To clarify some earlier post of mine. My concern is that Word has no way to distinguish certain commonly needed elements in a style. My favorite example is its inability to use dictionary style definitions for a numbered clause and not lose the capability of automagically updating the table of contents and using the Insert Cross Reference facility to refer to the clause's title. Many documents would have a need for three tags, one for dictionary style definition; another for the usual numbered clause, followed by a separate unnumbered paragraph; the third for a numbered clause without a title. 
Word cannot do the first and third, without losing the capability of updating the table of contents and using cross references properly. I pointed out these shortcomings to Microsoft in, as I recall, January 1993; perhaps Word 7.0 will have these and other problems fixed. One can only hope. Newsgroups: comp.text.sgml Date: 10 Oct 1994 19:03:05 UT From: Kevin Brennan \ Organization: Honeywell ATS Message-ID: <37c359$j04@bmw.hwcae.az.Honeywell.COM> References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> <3762vk$irc@deep.rsoft.bc.ca> Subject: Re: SGML "Viewers" [Kevin Brennan] | We're particularly interested in Air Transport Association (ATA) | SGML compliant viewers with Temporary Revision capability. [Tim Bray] | What's Temporary Revision capability in a viewer? Inquiring minds | want to know. Temporary Revision capability allows the using organization to provide revisions to the text contained on the CDROM without cutting a new CDROM. For example, an airline using maintenance manuals on CDROM may receive revisions to the manual *between* issuances of new CDROMs. Jouve's software puts a revision file on the local hard drive and replaces or annotates text appropriately. In the case of a networked resource, there can be a shared file on the network. There is, of course, separate TR software (optional, at some cost) that the Maintenance Publication department would have to acquire. K :-) Newsgroups: comp.text.sgml Date: 10 Oct 1994 20:57:10 UT From: Cheryl Conway \ Message-ID: <9409117818.AA781866386@njcorp.akbs.com> Subject: Change of employer & contact info for Cheryl (Ventura) Conwa I am posting this by way of introducing myself to comp.text.sgml... and also in the hopes that my former colleagues will see this, and update their information for me, so I once again get onto all of the SGML-related mailing lists, etc. 
I am Cheryl (Ventura) Conway and my work, for the past 10 years has been in: - on-line (formerly aircraft) maintenance manuals, - SGML and electronic document processing systems, - conversion of paper - or "legacy data" - document libraries to on-line, SGML-encoded, hypertext form, - and integrating them with AI diagnostic systems or data acquisition systems (for device-under-test) My FORMER position, employer, and address, phone, etc. were: Cheryl Ventura Conway, Staff Engineer AlliedSignal Guidance & Control Systems Division mailstop G/11 or M/11 or F/12, Teterboro, NJ 07608 +1 201 393 3842 I have recently changed jobs to a better position. My NEW information is: Cheryl Conway, Consultant (no longer using maiden name) Advantage KBS 1 Ethel Road, Suite 106B Edison, NJ 08817 cconway@akbs.com, Internet +1 908 287 6494 x 253, phone +1 908 287 3193, FAX I am continuing to work in the same areas, only now applied to maintaining communications (e.g., telephone) devices and services, both hardware and software, rather than (applied to) primarily hardware aircraft maintenance. Advantage KBS markets a particularly effective on-line help tool based on an integration of SGML-encoded on-line hypertext manuals, integrated with an AI diagnostic engine. If you've previously had contact information for me, or had me on your mailing list -- please do me the favor of updating your information for me so I can get myself re-connected to the SGML world! Regards, Cheryl Conway cconway@akbs.com Newsgroups: comp.text.sgml Date: 10 Oct 1994 21:34:54 UT From: "John Vernon" \ Organization: METASOFT Message-ID: \ References: <36p76b$c1@erinews.ericsson.se> <1994Oct5.143731.20589@ast.saic.com> Subject: Re: SGML to Postscript formatter [Bob Agnew] | FOSIs are supported by Arbortext and DataLogic. DSSSL? Can you give me any contact details for any of these two companies? I am looking for an SGML to Postscript converter. 
The main concern is ease of use (e.g., FOSI-defined style info is fine, but something simpler in addition would help), performance and availability across platforms. Can you advise me where I might find such a converter? Much obliged John Vernon Newsgroups: comp.text.sgml Date: 10 Oct 1994 22:35:06 UT From: BlueSky \ Message-ID: \ Subject: SGML parsers for UNIX/public domain ?? Hi, I am interested in finding out whether there are any public domain parsers that work on Unix and check the compliance of the text instance with the DTD. Any information will be greatly appreciated. Thank you, RAVI bluesky@netcom.com -- ______ BMW 325i: I love my Car: /______\\ bluesky@netcom.com FLY OUR FRIENDLY SKIES (oo=OO=oo) +1 415 634 4002 []""""""[] __|__ __|__ __|__ *---o0o---* *---o0o---* *---o0o---* | __|__ ____|____ --o--o--(_)--o--o-- -o---O--( . )--O---o- ___________ | _ _|_ _ (_)-/ \\-(_) _ /\\___/\\ _ (_)_______________________( ( . ) )_______________________(_) \\_____/ Newsgroups: comp.text.sgml Date: 10 Oct 1994 23:06:31 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ Subject: Re: Storage Models for SGML data [Peter Flynn] | This is much easier. Use referential mnemonics, not hard-wired | locations. [La]TeX has been doing this for over a decade, HTML for a | few years. [W. Eliot Kimber] | I'm not sure what Peter means by "referential mnemonics". Do you mean | ID references as opposed to something like structural position or byte | offset within a given file? If so, then this has been part of SGML | from the beginning. 
If, instead, you mean locations based on | properties of objects other than their unique names, then HyTime's | location address methods do this, specifically the ability to use | queries that refer to the values of properties of objects (e.g., I want | to find the CommandRef element whose CommandName subelement exactly | contains the value "fubar" and whose revision attribute value is | greater than "1.1"). I mean the second: sorry, I was typing fast :-) A better-known usage might be LaTeX's way of using user-made labels as pointers, although these lack the richness of HyTime's location addressing. I'm still looking for systems which can replicate the canonical references for me tho... ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 10 Oct 1994 23:15:10 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | My concern is that Word has no way to distinguish certain commonly | needed elements in a style. My favorite example is its inability to | use dictionary style definitions for a numbered clause and not lose the | capability of automagically updating the table of contents and using | the Insert Cross Reference facility to refer to the clause's title. | | Many documents would have a need for three tags, one for dictionary | style definition; another for the usual numbered clause, followed by a | separate unnumbered paragraph; the third for a numbered clause without | a title. 
Word cannot do the first and third, without losing the | capability of updating the table of contents and using cross references | properly. I cannot conceive of trying to do this with a visual-structure system like Word: if printed output is the principal goal, a logical-structure system like LaTeX would be the natural choice. Why make life difficult? ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 11 Oct 1994 01:40:06 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Peter Flynn] | I cannot conceive of trying to do this with a visual-structure system | like Word: if printed output is the principal goal, a logical-structure | system like LaTeX would be the natural choice. Why make life | difficult? Well, life happens to be difficult. I am forced to use Word to deal with many people that I have to send documents to or to share editing tasks with. Word is not the right tool to use, but it has become a de facto standard in many areas. Personally, I would much rather do everything in SGML, perhaps the environment I am in will be there some day, not yet tho. Newsgroups: comp.text.sgml Date: 11 Oct 1994 06:14:03 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. 
It covers everything you can do in Word, including | math and tables. [Richard Wal Matzen] | This seems to imply that the formatting info associated with the styles | is output to the SGML file. However, in all of the Microsoft press | releases I havent seen any references to style info being output. So | my questions are: : | Does anyone know if/how this product handles the exchange of style | (formatting) info? Is any style info output? If so, how complete is | it and in what form is it: attributes? Similar to Rainbow DTD? The above inference is incorrect. The quoted quote makes it sound like the mapping from styles to SGML is *automatically* defined, but this is not the case. Someone (you or some specialist) must define the mapping from a given set of styles to a particular SGML document type. As a rule, the style definition for an SGML document is separate from the document itself (e.g., FOSIs, DynaText style sheets, etc.) so it is generally not meaningful to talk about style exchange in this context (e.g., going from creating a structured document in Word that is exported as SGML). In other words, the purpose of Word's SGML tools is not really to allow you to export any arbitrary Word document in some sort of equivalent SGML form, but to allow you to create in Word documents that conform to a specific document type. The same is true of Word Perfect's Intellitag. However, there's nothing that prevents you from defining a Rainbow-type DTD that has the elements and/or attributes needed to capture the style information that might be in Word documents, such as paragraph-level format changes. However, it would be up to you to define both the DTD and the mappings to it from Word -- Microsoft cannot provide that out of the box because every case will be different. I saw a demo of this product at Seybold and it seemed to have some interesting features. 
I am hesitant to characterize it, but from what I saw it seemed to fall somewhere between Intellitag and FrameBuilder in terms of support for structured authoring and its import and export facilities. The Microsoft representatives were careful to characterize it as a first step in a continuum of structure- and SGML-based authoring tools. They have, for example, partnered with SoftQuad to provide a true SGML editor (a stripped-down version of Author/Editor) for doing post-export cleanup. The Microsoft people impressed me with their candor and their understanding of SGML and the limitations of their product. They appear to have avoided the mistakes of their predecessors and have not tried to oversell what they're offering. From what I've seen, it appears to be a tool that can fit into a larger SGML-based strategy that includes a variety of tools, SGML and non-SGML, appropriate to specific users and tasks. -- \
W. Eliot Kimber (kimber@passage.com) Systems Analyst and HyTime Consultant Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400 465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911 \
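The point made in the post above, that someone must define the mapping from a given set of styles to a particular document type, can be sketched in a few lines. This is only an illustration and says nothing about how Microsoft's product stores or applies its mappings: the style names, the element names, and the `export_paragraphs` helper are all hypothetical.

```python
from html import escape

# Hypothetical, hand-written mapping from word-processor style names to
# element types of some target DTD. Every document type needs its own table.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Normal": "para",
    "List Bullet": "item",
}


def export_paragraphs(paragraphs, mapping):
    """Emit one element per (style, text) pair; refuse styles with no mapping."""
    lines = []
    for style, text in paragraphs:
        if style not in mapping:
            # Nothing is inferred automatically: an unmapped style is an error.
            raise KeyError(f"no element mapped for style {style!r}")
        tag = mapping[style]
        # Escape markup-significant characters in the text content.
        lines.append(f"<{tag}>{escape(text, quote=False)}</{tag}>")
    return "\n".join(lines)


document = [("Heading 1", "Storage Models"), ("Normal", "Links & anchors")]
print(export_paragraphs(document, STYLE_TO_ELEMENT))
```

Note that the mapping carries only structure. Formatting attached to the styles is dropped, which matches the observation that style information normally lives outside the SGML document.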
Newsgroups: comp.text.sgml Date: 11 Oct 1994 06:32:58 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ Subject: Re: Storage Models for SGML data [Peter Flynn] | I mean the second: sorry, I was typing fast :-) A better-known usage | might be LaTeX's way of using user-made labels as pointers, although | these lack the richness of HyTime's location addressing. I'm not a TeX user, so I'm not familiar with its addressing features, but I'm wondering how a "user-made" label differs from an SGML unique identifier. | I'm still looking for systems which can replicate the canonical | references for me tho... If I understood your requirement, you're specifying a query along the lines of: "find the 4th paragraph that is a descendant of the 2nd Section within the 3rd Chapter." If that's the case, then you can certainly express it as a HyQ query, as well as in the query languages of a number of other document querying systems, such as OpenText's PAT and AIS' SGML Search. You could even define your own syntax as a new "query notation" and use it within a HyTime addressing context, e.g.: \ \ \ \ ]> \ ... \Check out this paragraph\ \\Para4 in Sect2 in Chap3\ ... \ You could define the DTD to give you more tag minimization to reduce the keystrokes somewhat. For example, you could create a specialized named location element type that can only contain a CanonicalRef element, which would allow the following minimization: \Para4 in Sect2 in Chap3\ Which might be more intuitive for some authors. It's important to remember that any form of addressing can be integrated within a HyTime framework using either name queries or notation-specific locations (which are a form of query), which means that in essence your addressing functions are limited only by the power of your retrieval system, not by some inherent limitation in SGML or HyTime. 
Of course this example of using a specialized query notation has some problems with interchange if support for this form of query is not wide spread, but that problem will always exist because no single solution can solve all problems. In this example, I assume that the need for this form of reference is more compelling than a need to enable query interchange. -- \
W. Eliot Kimber (kimber@passage.com) Systems Analyst and HyTime Consultant Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400 465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911 \
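A resolver for the "Para4 in Sect2 in Chap3" notation discussed in the post above might look roughly like the following sketch. It follows the reading given there ("the 4th paragraph that is a descendant of the 2nd Section within the 3rd Chapter"), but everything else is an assumption: the element type names Chap, Sect and Para, the sample tree, and the `parse_canonical_ref` and `resolve` helpers are hypothetical, and an XML tree stands in for a parsed SGML instance.

```python
import re
import xml.etree.ElementTree as ET


def parse_canonical_ref(ref):
    """Split 'Para4 in Sect2 in Chap3' into [('Chap', 3), ('Sect', 2), ('Para', 4)]."""
    steps = []
    for token in ref.split(" in "):
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", token.strip())
        if match is None:
            raise ValueError(f"cannot parse step {token!r}")
        steps.append((match.group(1), int(match.group(2))))
    steps.reverse()  # resolve from the outermost container inward
    return steps


def resolve(root, ref):
    """Return the addressed element, or None if any step has no such descendant."""
    node = root
    for tag, ordinal in parse_canonical_ref(ref):
        # Descendants of the current node with this element type, in document order.
        candidates = [elem for elem in node.iter(tag) if elem is not node]
        if ordinal < 1 or ordinal > len(candidates):
            return None
        node = candidates[ordinal - 1]
    return node


def build_sample():
    """Hypothetical book: 3 chapters, each with 2 sections of 4 paragraphs."""
    book = ET.Element("Book")
    for c in range(1, 4):
        chap = ET.SubElement(book, "Chap")
        for s in range(1, 3):
            sect = ET.SubElement(chap, "Sect")
            for p in range(1, 5):
                para = ET.SubElement(sect, "Para")
                para.text = f"chapter {c}, section {s}, paragraph {p}"
    return book


book = build_sample()
target = resolve(book, "Para4 in Sect2 in Chap3")
print(target.text)
```

Such positional references are exactly the kind that shift when a paragraph is inserted upstream, so a link manager would still need to track them; the sketch only shows that the notation itself is cheap to support in a retrieval system.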
Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 11 Oct 1994 08:52:02 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: \ \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Well, life happens to be difficult. I am forced to use Word to deal | with many people that I have to send documents to or to share editing | tasks with. Word is not the right tool to use, but it has become a de | facto standard in many areas. Sure, I wasn't attaching any blame to you, just continuing bafflement at how these decisions get made in the light of all the evidence :-) | Personally, I would much rather do everything in SGML, perhaps the | environment I am in will be there some day, not yet tho. I dunno, there are a lot of jobs more suited to Word in an office environment right now, while SGML software remains relatively underused. ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml Date: 11 Oct 1994 13:09:00 UT From: "Bart Souren" \ Organization: Tilburg University / The Netherlands Message-ID: <19941011140936.B.Souren@sp0294.kub.nl> Subject: Looking for SGML/HTML final project. I am a student at the Faculty of Arts of the Tilburg University in Holland. My specialisation area is "Text and Computers", which among other things deals with SGML and HTML. I am looking for a SGML/HTML project for my final paper. From november till february/march I will be busy with a teaching practice at Wolters Kluwer Publishers, which will deal with SGML. After my teaching practice I will have to do a final project, which I prefer to do abroad. Especially I prefer european countries and Australia. 
My department at Tilburg University requires me to do my final project at a university or scientific institute. So if you know a university or scientific institute where it would be possible to do a final project, please let me know. Keep in touch, -- Bart Souren Bisschop Zwijsenstraat 26 5038 VB Tilburg The Netherlands Newsgroups: comp.text.sgml Date: 11 Oct 1994 13:31:07 UT From: Charles Hurley \ Organization: PANIX Public Access Internet and Unix, NYC Message-ID: <37e42r$mop@panix.com> References: <3762qj$4p6@finnegan.iol.ie> Subject: Re: YASP, does anyone have a MS-DOS copy? [Sean Mc Grath] | I have looked through the compilation scripts etc. for YASP but I do | not have access to an OS/2 box to compile up a DOS version as required. | I guess with a bit of work the REXX script could be analysed to produce | a DOS makefile for Microsoft or Borland and the code compiled directly | on a DOS based PC. If this has already been done, I would | appreciate any information. If it has not been done, we will do it and | post details here. | Either way, a DOS binary of YASP compiled using the existing OS/2 | hosted compilation process would be very useful. If you are willing to write the code, I will compile it on Borland C++ or DJGPP. I don't know if I can be of any greater help than that. -- % Charles R. Hurley Phone: +1 212 633 3759 % Publishing Technologies Coordinator (TeX Projects) Fax: +1 212 633 3685 % Elsevier Science Inc. E-mail: churley@panix.com % 655 Avenue of the Americas, New York, N.Y. 10010-5107 Newsgroups: comp.text.sgml Date: 11 Oct 1994 21:13:29 UT From: Baohuan Zhang \ Organization: Computer Science, Liverpool University Message-ID: \ Keywords: SGML DTD Subject: How to make a SGML DTD document? Hi, I need to make an SGML DTD document. I would appreciate it if anybody could give me some examples of DTDs. 
Many Thanks, Baohuan Newsgroups: comp.text.sgml Date: 11 Oct 1994 22:58:07 UT From: Darin Duvall \ Message-ID: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> Subject: Accessibility and access to the Internet Fellow information providers: I've been pretty good about reading all of my newsgroups lately, and I haven't read anything that would suggest that adopting PDFs as a standard would have any sort of harmful effect (see messages below). Did I miss that part? Adobe's PDFs are an OPEN FILE FORMAT. Anyone can write an application to read PDFs. Anyone can write a plug-in that reads PDFs into their existing application. A PDF could be routed to a speech synthesizer or a braille printer just as easily as any other text file, as long as you have a filter for it. [Yuri Rubinsky] | The idea that the page image -- one-time, non-reusable, non-adaptable, | etc -- should be considered a valuable archival or display form is | frightening. Sounds like Yuri should try out Acrobat before jumping to any conclusions. None of the statements above apply to PDF documents. You can copy the text and/or graphics from a PDF and paste them into any Mac or Windows application. You could even write an application that lets you modify a PDF. A PDF is not just a page image. It's the original text and graphics stored along with the formatting information that makes it look consistent across platforms/ configurations. [Mike Paciello] | ...if ADOBE has their way. Once again, may (sic) folks will be left | without true access to information, particularly on the NII. What is Mike talking about? Acrobat's technology is about *increasing* access. For the first time ever, the average person can distribute nice looking documents that incorporate hotlinks and graphics, and different views. The idea that I may one day be able to access all Federal documents in PDF form is WONDERFUL! An indexed document that lets me click on a topic an jump right to that info... 
Our only current alternative is SGML, and there are NO tools that let ordinary people create meaningful SGML documents. SGML certainly has its place, but not as the defacto standard for document exchange. SGML is too complicated and limited for such broad use. Let's not let a few SGML die-hards spoil an opportunity for the rest of us. Mike, the Acrobat Readers are all FREE! I downloaded the Mac, DOS, and Windows versions today! They are on the Net and every major on-line service. How is that limiting access? Mosaic documents can call up PDF files. A PDF can have a link to a Mosaic file. How is that limiting access? If there is an opportunity for Adobe to make some money while we citizens get some really cool information technology, I'm all for it. If there are legitimate concerns about information access, I'm all ears. You can convince me to change my viewpoint with facts, but so far I'm seeing a whole lot of hollering and hype, and I don't see the fire... --------------------------------------------------------------------------- [Mike Paciello] Hi Folks! Please distribute the attached mail on a many listservs as possible. This is a MAJOR concern if ADOBE has their way. Once again, may folks will be left without true access to information, particularly on the NII. ICADD is working hard to block this, as well as the SGML Consortium. We hope to get your support. Please note that I will be at the Mosaic '94 conference next week and at the Internet '94 Conference in December, to ensure that we have an equal voice in accessibilty on the NII. Thanks for your support, Regards, Mike Paciello Program Manager Vision Impaired Information Services Digital Equipment Corporation [Yuri Rubinsky] I understand that the US National Institute of Standards and Technology, supported by dollars from the Dept of Defense and Adobe has done a study suggesting that Adobe's proprietary PDF format be made a US Federal Information Processing Standard for electronic viewing. 
My feeling is that this would throw back the cause of accessible documents immeasurably. The idea that the page image -- one-time, non-reusable, non-adaptable, etc -- should be considered a valuable archival or display form is frightening. I have heard all this just today, and very third hand, but I understand that Oct 17th is the last day for opposing viewpoints to be heard. I encourage those of you who are American, and taxpayers, to make your feelings known. The SGML Open Consortium will certainly do what it can in this area. If you have feelings in this area, or want to know what's going on, contact Mike Rubenfeld at NIST, +1 301 975 3064. Please keep the rest of this list informed too. Yuri Rubinsky --------------------------------------------------------------------------- Sincerely, -- Darin Duvall, Electronic Publishing Specialist University of Illinois _______________________ College of Agriculture | voice: +1 217 244 2832 Information Services | fax: +1 217 244 7503 47 Mumford Hall | 1301 W. Gregory Drive | e-mail: duvall@uiuc.edu Urbana, IL 61801 | Newsgroups: comp.text.sgml Date: 11 Oct 1994 23:01:00 UT From: Peter Kerr \ Organization: School of Music University of Auckland Message-ID: \ References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> <9410101517.AA44549@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? Whatever happens to the Web we need guys like Claude: Article <9410101517.AA44549@source.asset.com>, "Claude L. Bullard" \ out there batting for common sense and human decency! 
-- Peter Kerr bodger School of Music chandler University of Auckland neo-Luddite Newsgroups: comp.text.sgml Date: 12 Oct 1994 10:33:07 UT From: Arthur Ogawa \ Organization: TeX Consultants Message-ID: \ References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> Subject: Re: Accessibility and access to the Internet [Darin Duvall] | I've been pretty good about reading all of my newsgroups lately, and I | haven't read anything that would suggest that adopting PDFs as a | standard would have any sort of harmful effect (see messages below). | Did I miss that part? First, many thanks to Darin for posting this note; I hadn't seen anything else to date. I intend to contact the NIST with my concerns, and I hope others in this newsgroup do likewise, regardless of which side of the question you may fall on. The crucial issue here, as I see it, is the nature of the document. Adobe's PDF (of which I am an enthusiastic supporter) is a fine *page description language*, which is not to say that it qualifies as a "primary document". Rather, it is a file format for a derived document, which can be viewed, printed, browsed, and possibly even handed over to a system that allows it to be "read out" for, say, a print-impaired person. The ideal document management system would be one in which a primary document could be served up to a subscriber in whatever form is most convenient for that person. However, PDF does not fulfill the criteria for a primary document. I rather picture the PDF being generated on the fly from an SGML document. That said, it is possible that Adobe will someday achieve something in PDF that will make it qualify as a primary document. In my mind that quality is that one can recover the original, validatable SGML document from the PDF document via some sort of filter. And I am encouraged to look for that outcome, because PDF has been supported and further developed by Adobe since its introduction 2 years ago. 
It is now possible (in the revised Acrobat 2.0) to do text searches in a document with considerably more power than before. And it has always been possible to prepare the PDF document with structured comments which could in principle represent the SGML markup in the original. So in some limited sense, PDF is usable as a primary document, but only to the extent that it fully supports this "SGML transparency". So, what I am saying is that the technical issue around PDF is whether it is capable of the expressive power needed to allow this recovery of the original document. And sad to say, presently it does not provide full capabilities. I have heard, by the way, of the possibility of some Avalanche-provided SGML capability for PDF. I have not heard enough about it to get very enthusiastic, though. | Adobe's PDFs are an OPEN FILE FORMAT. Anyone can write an application | to read PDFs. Anyone can write a plug-in that reads PDFs into their | existing application. A PDF could be routed to a speech synthesizer or | a braille printer just as easily as any other text file, as long as you | have a filter for it. Sure PDF is great, but I think what many of us are concerned about is that it is insufficient to present a formatted document to a print-impaired reader. Those who have worked in the field are striving for a way to deliver much richer information to such a person. In effect she will be able to browse the document structurally if she so chooses. We who can easily read print do take a lot for granted. We can describe the structure of a document just by glancing at its visual presentation. Such is the miracle of the typesetter's art and our own visual cortex. But try working with a document that is poorly formatted, or in which the structure is obfuscated in some way or other -- and you'll get a taste of what a task it is for a print-impaired reader to process any document. 
[Yuri Rubinsky] | The idea that the page image -- one-time, non-reusable, non-adaptable, | etc -- should be considered a valuable archival or display form is | frightening. [Darin Duvall] | Sounds like Yuri should try out Acrobat before jumping to any | conclusions. None of the statements above apply to PDF documents. You | can copy the text and/or graphics from a PDF and paste them into any | Mac or Windows application. You could even write an application that | lets you modify a PDF. A PDF is not just a page image. It's the | original text and graphics stored along with the formatting information | that makes it look consistent across platforms/ configurations. Yuri's point is tersely stated, and perhaps not compelling to some. (And I'll wager that he *has* tried Acrobat out.) Yes, you can mung up an Acrobat document. Yes, it is accessible (being mostly ASCII). But I don't agree that Acrobat is more than a page image (well...page *images* ;-). A bitmap it's not, true. But inside the PDF, you'll look in vain for a statement analogous to "beginning of chapter" or "book title starts here" or the like. After examining the PDF standard, I conclude that inferring structure out of a PDF is presently no easier than doing so out of, say, a Microsoft Word document. And that failing is the big problem. [Mike Paciello] | ...if ADOBE has their way. Once again, may (sic) folks will be left | without true access to information, particularly on the NII. [Darin Duvall] | What is Mike talking about? Acrobat's technology is about *increasing* | access. Granted, Adobe intends to increase access -- to sighted readers. It's the others that one is concerned with. I think that those in this newsgroup can appreciate the gap between Adobe's current technology and what's necessary for high-utility access to a document. | For the first time ever, the average person can distribute nice looking | documents that incorporate hotlinks and graphics, and different views.
| The idea that I may one day be able to access all Federal documents in | PDF form is WONDERFUL! An indexed document that lets me click on a | topic an jump right to that info... I too am turned on by the concept of widespread access to Federal documents. But I don't see Acrobat as an adequate form for representing that document in its primary form. I think that if someone wants to view such a document in the form of a PDF, then that person should be able to do so. But let's keep the richness of the document's structure in the primary document, so it can be utilized by those capable of doing so. I would favor the original document in SGML form, with PDF delivery an option. | Our only current alternative is SGML, and there are NO tools that let | ordinary people create meaningful SGML documents. SGML certainly has | its place, but not as the defacto standard for document exchange. SGML | is too complicated and limited for such broad use. Let's not let a few | SGML die-hards spoil an opportunity for the rest of us. I don't quite see things this way. First, SGML is not the only alternative to PDF. There are plenty of further alternatives, and none measures up to the standard of information richness that SGML provides. Second, the issue in certain areas might be whether ordinary people can create meaningful documents, but here we are talking about the NIST and the Federal government. It is they who will be creating these documents. So as far as ordinary people go (and lots of the unsighted are ordinary people, you must admit), it's a question of information delivery. And for some of us, PDF won't work well. Third, document exchange is not really an issue here, either. It is document *delivery* that Darin seems to be addressing. If a government document is stored in SGML form, this does not in and of itself limit access to that document. When a request for that document is received, one option may be to deliver it in its archival form, SGML; another option may be PDF.
Too complicated? Maybe to someone who has to read the SGML document with an ASCII text editor. But I don't see this (time-honored, archaic) way as the way of the future. Limited? The only practical limitation of SGML on the delivery side (that is, once the SGML document exists) is availability of a dirt-cheap browser for folks who don't like to pay real money for the software they use. Back when Acrobat was first announced, Adobe took a lot of flak for having the temerity to suggest that each reader might want to pay $25 for a Viewer. Now that the price has gone to $0.00 (plus the cost of net.access ;-), Acrobat receives Darin's enthusiastic vote. Heh. (The thing that will tip the scales of the public mind in favor of SGML might be something that only has a Name For Now, Yuri.) | Mike, the Acrobat Readers are all FREE! I downloaded the Mac, DOS, and | Windows versions today! They are on the Net and every major on-line | service. How is that limiting access? I love to surf, too, but I am aware that net.access is by no means the common way of getting information. NIST may provide potentially universal access by providing documents online, but let's not kid ourselves that this amounts to anything at all like *universal* access in fact. Meanwhile, if they do choose PDF as the archival form of the document, and turn their backs on SGML, they effectively cut out a large segment of the population (who fall in a category called "the disabled"). This is the thing I don't want. I and others have been working for quite some time to increase access to information to people in precisely such a way that there are *fewer* inherent barriers. SGML per se does not erect barriers. It *can* eliminate some. By the way, Darin, you picked up Mac, DOS, and Win versions, right? Did you not feel a twinge of sympathy for the poor souls who are confined to a Unix box? Huh? | Mosaic documents can call up PDF files. A PDF can have a link to a | Mosaic file. How is that limiting access?
If there is an opportunity | for Adobe to make some money while we citizens get some really cool | information technology, I'm all for it. Me, too. I'm heavily invested in the continuing success of numerous Adobe products, so, yes, I do want to see them make money. Of course, in this matter I have learned to trust them to look out for their own interests quite effectively. ;-)  I just wonder about their giving away copies of the Acrobat Reader, you know. High up-front development costs and then -- just where do they get the bucks out of this? Makes me want to sell my Adobe stock, it does. ;-) Sorry, couldn't resist. | If there are legitimate concerns about information access, I'm all | ears. You can convince me to change my viewpoint with facts, but so | far I'm seeing a whole lot of hollering and hype, and I don't see the | fire... And so, convincing people is what this note is for. I hope it was worthwhile reading. But I'm a little disturbed by your use of 'legitimate' here. I hope that concerns raised on behalf of print-impaired people and those whose ideas of information access are more demanding than yours, even though not directly impacting yourself, are nonetheless not labeled lacking in legitimacy. Indeed it's *not* just a matter of information access by a certain fraction of the disabled. SGML documents will enable a higher level of access to all people. At a time when the amount of information published by the Federal government is prodigious already and increasing at a breathtaking pace (doubling roughly every two years, I've heard), having that information in a digestible form is equally important to having access at all. Which means, I'm saying, that access via a page description language is, for some purposes, no access at all. Just imagine doing a computer-aided analysis of a Federal document in PDF form. PDF should not be the primary document's representation; that should be SGML's role. PDF should be offered as one delivery vehicle of many. 
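To make the computer-aided analysis point a little more concrete, here is a minimal Python sketch of the kind of query that structural markup makes trivial. It is only an illustration: the element names (report, chapter, heading, p) are hypothetical, the sample document is made up, and Python's html.parser is used as a stand-in for a real SGML parser, which only works here because the sample is fully tagged with no minimization.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <heading> element in a tagged document.

    'heading' is a hypothetical element name chosen for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "heading":
            self.in_heading = True
            self.headings.append("")  # start a new, empty heading

    def handle_endtag(self, tag):
        if tag == "heading":
            self.in_heading = False

    def handle_data(self, data):
        if self.in_heading:
            self.headings[-1] += data  # text may arrive in several pieces


# Made-up sample document; every element has explicit start and end tags.
SAMPLE = (
    "<report>"
    "<chapter><heading>Scope</heading><p>Some text.</p></chapter>"
    "<chapter><heading>Findings</heading><p>More text.</p></chapter>"
    "</report>"
)

collector = HeadingCollector()
collector.feed(SAMPLE)
collector.close()
print(collector.headings)
```

The program never has to guess which run of text is a chapter heading, because the markup says so. A page description carries fonts and positions instead, so the same question would have to be answered by inference from layout.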
[Mike Paciello] | ICADD is working hard to block this, as well as the SGML Consortium. | We hope to get your support. For your information, ICADD is a group specifically engaged in broadening access to information by print disabled persons. I suggest that someone with direct knowledge of the organization might wish to post some information saying exactly why they are working hard against the proposal in question. [Yuri Rubinsky] | If you have feelings in this area, or want to know what's going on, | contact Mike Rubenfeld at NIST, +1 301 975 3064. And I intend to do just that. I hope others will as well. -- Arthur Ogawa/Publishing Consultant/TeX Consultants ogawa@teleport.com/voice:+1 209 561 4585/fax:4584 Mail: Kaweah CA 93237-0051 PGP-Key: "finger -l ogawa@teleport.com" Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:28:54 UT From: Bowden Wise \ Organization: Rensselaer Polytechnic Institute Computer Science, Troy NY Message-ID: <37h6cm$7f1@usenet.rpi.edu> Subject: Where are DTDs for HTML? Where can I find the DTDs for the various implementations of HTML (e.g., 1.0, 2.0, Plus)? Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:39:40 UT From: "Claude L. Bullard" \ Message-ID: <9410121739.AA14593@source.asset.com> References: \ \ \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Personally, I would much rather do everything in SGML, perhaps the | environment I am in will be there some day, not yet tho. [Peter Flynn] | I dunno, there are a lot of jobs more suited to Word in an office | environment right now, while SGML software remains relatively | underused. [Eliot Kimber] | From what I've seen, it appears to be a tool that can fit into a larger | SGML-based strategy that includes a variety of tools, SGML and | non-SGML, appropriate to specific users and tasks. If I may add my two dollars worth. For years I held the same opinion as Mr. Kaikow, but the changing technology and breadth of applications forced me to recant.
What Mr. Flynn says is IMO true in one sense: how well the interface matches the user's expectations for the application strongly influences their acceptance of the product. This isn't just an issue of habit; the user interface strongly affects productivity in many ways including the learning curve for the product as well as the ease of entry (repetitive actions, modality of actions, etc.). My real support here goes to Mr. Kimber. I've a lot of faith in Eliot's judgment. If he says Microsoft Word is viable, I accept that. What he adds that is most important IMO is the concept that SGML tools have to be "appropriate to specific users and tasks." While general purpose SGML tools are certainly useful, they tend to also be expensive and difficult to use. This is typically true of most "one size fits all" solutions: snug on some, baggy on others, too tight where it hurts. |-) For newbies who may not understand why I take this position, here is a real world example. First, a WYSIWYG editor is built to make it easy to work with what some refer to as the "book metaphor". That is, it represents the printed page or even the book-like display on a screen. *Metaphor* is used here to indicate that a "model" of some real object has been used to design an interface in which the data entry model closely resembles the intended output media. So for element types like this: \ \ and to some extent this \ can productively use an interface which uses the book metaphor. The differences between the two point out something about using a general SGML editor. While in the first case one can reasonably determine with a simple SGML editor what the semantic intent of each generic identifier is, (yes, help with the attributes is needed), in the second case the brevity of the generic identifiers requires a little more knowledge as it has been designed (I'm guessing) to optimize file size for its operating environment.
When a more complex editing application is used that either hides the element names or always provides online help for interpreting the semantics, both are equally easy to apply. As both still result, more or less in page-like display, WYSIWYG editors can be used to create them. It's when one encounters an element type like this: \ \ or this \ that special services are required of the user interface metaphor because the book metaphor does not adequately represent them. (Cross my heart, those ARE real world examples. I apologize to Dan Connolly in advance if the expansion of the HTML element type is incorrect. I'm doing this during lunch. |-)) The first is an IDEF model from a re-engineering application. A diagram interface is used with dialog entry objects. The second is from a scripting language for a hypermedia application. The same graphical technique with dialog boxes for data entry works well here. See the Hughes Aircraft AIMMS, the Lockheed Ft. Worth IETM editor, and the KBSI automated modeling tools for examples of this type of interface and representation. In the case of the IETM, the editor is diagrammatic, but the output is an onscreen technical manual that does not resemble a book. In certain cases, while the output format should be available to the author, it is not the best model for the user interface. This is also clearly demonstrated by the Microstar Near & Far DTD editor where the requirements to apply a design methodology in a team of non-SGML literate users has precedence over learning the SGML syntax. Yes, the book metaphor can be used for entering all of these, but it isn't productive on the data entry side and isn't useful or productive to the output side. This is a cautionary tale to those newbies who are trying to build or procure inter-enterprise SGML applications. The strength of SGML is that it can be used to represent many information needs, but all of them cannot be productively authored or presented with a single tool. 
However, given a well-thought out plan for applying them, SGML allows them to interoperate where needed. ...and that's the beauty of the beast. Len Bullard Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:46:51 UT From: Karen Coyle \ Organization: University of California, Berkeley Message-ID: <37h7oh$kth@agate.berkeley.edu> Subject: SGML to ASCII I've been assigned the un-enviable task of reducing SGML documents to plain ASCII for the users of our information system who are limited to "dumb terminals." This is greatly complicated by the fact that these particular documents are very mathematically oriented. We will inevitably lose some information in this translation, but I would like to hear if there are any accepted standards (true or de facto) for ASCII representation of \ or TeX expressions. -- Thanks, Karen Coyle kec@stubbs.ucop.edu University of California Library Automation Newsgroups: comp.text.sgml Date: 12 Oct 1994 18:04:44 UT From: Chris Biber \ Message-ID: <9410122107.AA2519@notes.microstar.com> References: \ \ \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Apparently they have a | graphical interface to SGML. I would appreciate any information about | their product. [Christoph Altenhofen] | I suppose you think of Near\&Far, a tool to handle DTDs in a graphical | manner. I saw a demo on the CeBit in Hannover this spring, but didn't | take a closer look at it. In my opinion it's a good tool to visualize | the dependencies and structure of a DTD, but I don't think that | creating a DTD is made so much easier as the traditional way, coding | DTDs by hand. [Peter Flynn] | I have their demo disk. It's pretty good, but it _is_ only a demo, you | can't actually _do_ much with it. But the idea is nice and I think it | might make a good teaching aid. Me, I'll stick to Emacs and | psgml-mode. 
| | The disk and doc does say that they encourage copying of the demo, so | I'll zip the contents when I get back to the office and put it on my | ftp server: look next week on www.ucc.ie in pub/sgml for nearfar.zip or | something like that. Mail me to hassle me if it doesn't appear :-) I am always surprised to see how little is known about our product, NEAR & FAR. NEAR & FAR is not only a DTD visualizer but is also the only tool for the graphical _creation_, _modification_ and _maintenance_ of DTDs. Being able to visualize the structure of your DTD is in itself of great benefit to document authors and DTD creators. Being able to visually _create_ the DTD brings measurable productivity increases to the process of DTD design. NEAR & FAR's interaction with a central tag and structure library (CADE Groupware) creates a complete document analysis and design environment. The demo disk (nfdemo.zip and nfdemo.txt) can already be found on the net at the following location: ftp://ftp.ifi.uio.no/pub/SGML/Demo/nfdemo.txt and ftp://ftp.ifi.uio.no/pub/SGML/Demo/nfdemo.zip When looking at the demo disk, please keep the following in mind: - It only includes compiled DTD fragments; the full version imports, validates and displays arbitrary DTDs to any level. - validation, reporting, printing and other advanced features are _not_ supported on the demo disk; they are, however, supported in the normal version of NEAR & FAR. I would appreciate any feedback from N\&F users out there - Case studies, problems encountered, suggestions are all welcome. I would also like to hear from those of you who know NEAR & FAR but have decided not to use it. I personally am somewhat baffled that _anybody_ would still exclusively code by hand (oops... that's my marketing genes speaking :-). What tools are you using to document the DTD? -- /\\ /\\ Chris Biber - cbiber@microstar.com \\/ \\/ Computer /\\ /\\ Aided Microstar Software Ltd. 
Phone: +1 613 727-5696 \\/ \\/ Document 34 Colonnade Road North Fax: +1 613 727-9491 /\\ /\\ Engineering Nepean Ontario Can/US: 1 800 267-9975 \\/ \\/ CANADA K2E 7J6 Info: cade@microstar.com Newsgroups: comp.text.sgml Date: 12 Oct 1994 18:37:14 UT From: Mark Walter \ Message-ID: \ References: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. It covers everything you can do in Word, including | math and tables. [Richard Matzen] | This seems to imply that the formatting info associated with the styles | is output to the SGML file. However, in all of the Microsoft press | releases I havent seen any references to style info being output. So | my questions are: | | Does anyone know if/how this product handles the exchange of style | (formatting) info? Is any style info output? If so, how complete is | it and in what form is it: attributes? Similar to Rainbow DTD? | | Also, same question for WordPerfect's SGML product. Sorry if I implied it exports style info. It does not. Microsoft believes that's what RTF is for :-) I haven't heard of any plans for FOSI, DSSSL, etc. Ditto for WordPerfect, last I heard . . . Mark Walter mwalter@seyboldpub.ziff.com Newsgroups: comp.text.sgml Date: 13 Oct 1994 00:09:57 UT From: "Wayne L. Wohler" \ Organization: IBM Information Development Strategy and Tools Message-ID: <37htslINNogp@afshub.boulder.ibm.com> Keywords: MSOCHAR MSICHAR SGML Subject: function character parsing I've been doing some work to support the parsing of double byte character sets (DBCS), which in IBM terms means mixing of single and double byte character encoding. In EBCDIC, this is accomplished using two escape characters, one to shift into double byte mode and one to shift back to single byte mode. 
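To make the shift mechanism concrete, here is a minimal Python sketch of the stateful scan that mixed single/double byte data requires. It is not taken from YASP or any other parser. The function name split_mixed is made up for illustration, and the default code points 0x0E and 0x0F for shift-out and shift-in are an assumption of this sketch, so they are passed as parameters.

```python
from typing import List, Tuple

SO = 0x0E  # assumed shift-out code point: enter double-byte mode
SI = 0x0F  # assumed shift-in code point: return to single-byte mode


def split_mixed(data: bytes, so: int = SO, si: int = SI) -> List[Tuple[str, bytes]]:
    """Split a mixed stream into ('sbcs', bytes) and ('dbcs', bytes) runs.

    The shift characters themselves are dropped from the output.
    """
    runs: List[Tuple[str, bytes]] = []
    mode = "sbcs"
    buf = bytearray()

    def flush() -> None:
        # Emit the pending run (if any) under the mode it was read in.
        if buf:
            runs.append((mode, bytes(buf)))
            buf.clear()

    for byte in data:
        if mode == "sbcs" and byte == so:
            flush()
            mode = "dbcs"
        elif mode == "dbcs" and byte == si and len(buf) % 2 == 0:
            # Only honour shift-in on a character boundary, so the second
            # byte of a double-byte character is never mistaken for it.
            flush()
            mode = "sbcs"
        else:
            buf.append(byte)

    if mode == "dbcs":
        raise ValueError("input ended inside a double-byte run")
    flush()
    return runs


print(split_mixed(b"AB\x0e\x41\x42\x43\x44\x0fCD"))
```

The point of the sketch is that the scanner has to carry the mode everywhere it reads data characters. Any context that reads bytes without consulting that state will misinterpret the double-byte run.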
On the face of it, these two escape characters seem a perfect match for the two function characters defined by SGML MSOCHAR and MSICHAR. In my testing using our parser (a version of YASP), I discovered that this works fine for element content and CDATA attribute values but that this solution does not work for some other contexts, specifically PIs, comments and CDATA marked sections. To test this feature yourself, the following lines need to be placed in the SGML declaration in the FUNCTION section of the syntax defn: SO MSOCHAR nn SI MSICHAR mm -- Wayne L. Wohler Internet: wohler@vnet.ibm.com Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL Information Development Strategy and Tools Phone: +1 303 924 5943 IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191 Newsgroups: comp.text.sgml Date: 13 Oct 1994 02:55:17 UT From: Dexter Medley \ Organization: Express Access Online Communications, Greenbelt, MD USA Message-ID: \ References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> \ Subject: Re: Accessibility and access to the Internet [Darin Duvall] | I've been pretty good about reading all of my newsgroups lately, and I | haven't read anything that would suggest that adopting PDFs as a | standard would have any sort of harmful effect (see messages below). | Did I miss that part? [Arthur Ogawa] | First, many thanks to Darin for posting this note; I hadn't seen | anything else to date. I intend to contact the NIST with my concerns, | and I hope others in this newsgroup do likewise, regardless of which | side of the question you may fall on. With all due respect to Yuri and others in this thread, it's much ado about (probably) nothing. 
Here's a little more of the straight poop: NIST has convened a so-called "blue ribbon" panel composed of approximately 25 government agency representatives who are involved in electronic document publishing and dissemination to formulate advice to NIST concerning 1) whether or not there should be a FIPS for a page description format, 2) what capabilities that format should provide, and 3) what that format should be. Toward these ends, Adobe has already made a presentation of the capabilities of its PDF; competing portable document technologies will be presented to the panel at an upcoming session (e.g., Replica, Common Ground, and Envoy are included). The panel will then formulate its recommendations. The reasoning of NIST and the panel (a significant number of whom are from agencies already very active in and committed to SGML) is that such a delivery method may be of use to government agencies for relatively small applications and for a relatively short period of time, until SGML, DSSSL, SPDL, etc., further mature commercially and more COTS practical SGML solutions are available, and that there may be benefits to be gleaned from standardizing on one solution. Virtually no one sees page description formats as a competitor to SGML, but rather as a complement or adjunct. Since someone already mentioned Mike Rubinfeld of NIST, who, along with Larry Welsh, is sponsoring this effort, it would be much more efficient for someone to ask them directly about this program before getting this world-wide representative group in such a tizzy... (Usual dis/claimer: I'm speaking for myself here, not the government; and yes, I'm one of those on the panel, so this is first-hand information.) -- Larry Newsgroups: comp.text.sgml Date: 13 Oct 1994 03:54:52 UT From: Jeff Schwartz \ Organization: Information Resources and Technology Message-ID: \ Subject: HoTMetaL Pro for Macintosh out yet? 
A couple months ago, a SoftQuad rep told me HoTMetaL Pro would be available for the Macintosh this fall. Anyone have any info on this? -- / Jeff Schwartz / http://edu-153.sfsu.edu / / webdog@aol.com / "A friendly little pothole on / / :::::::::::::::: / the information superhighway." / Newsgroups: comp.text.sgml Date: 13 Oct 1994 05:20:04 UT From: Harry Justice \ Organization: ICCS, Inc. Message-ID: <00985E2D.FB9F9AA0.198@tanuki.twics.com> Subject: Public domain full-text I am looking for public domain full-text engines/source code for unix. Does anyone know of any or know where a good place to look would be? Thanks Newsgroups: comp.text.sgml Date: 13 Oct 1994 05:28:53 UT From: Bill Wiest \ Message-ID: \ Subject: What is SGML? Hi, I am pondering the question "What is SGML?" and I have a few questions to ask here relating to that. Would it be fair to say that the reference concrete syntax is *really* more of a generalized markup language definition language than a concrete example of SGML? Is it true that any markup terms defined using the reference concrete syntax (or any variant concrete syntax) are in fact concrete examples of SGML? Is SGML per se abstract? Thank you for your help. --Bill Wiest (bwiest@suspects.com) Newsgroups: comp.text.sgml Date: 13 Oct 1994 06:41:43 UT From: Gary Benson \ Organization: Fluke Corporation, Everett, WA Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> Subject: Re: Anyone have any info about WORD's recent SGML release? [Robert Ducharme] | "Automatic tagging" may mean little more than taking something tagged | with the MS Word style "whatever" and surrounding it with | "\\" in some kind of exported ASCII files. | I have heard that Microsoft employs many OmniMark programmers and uses | SGML extensively for their CD-ROM products.
Given this and the | existence of the Word add-on, their lack of presence on comp.text.sgml | is a bit surprising--or, maybe their silence should tell us something. I don't get it, "...should tell us something". Seems to me there are any number of ways to read this: * Microsoft has no in-house expertise, and doesn't want to advertise that fact while they scramble to get on board. * Microsoft has a grand plan about how they are going to lead the SGML parade and want to keep mum until they have a few more ducks lined up. * Microsoft is not yet sure which way the wind is blowing and is just keeping all bases covered until the direction becomes clearer. .. or is what you had in mind really so self-evident and I'm Just Not Getting It (Again) [TM] ?? -- Gary Benson-_-_-_-_-_-_-_-_-_-inc@tc.fluke.com_-_-_-_-_-_-_-_-_-_-_-_-_-_- Inventions reached their limit long ago, and I see no hope for further development. -Julius Frontinus, 1st century AD Newsgroups: comp.text.sgml Date: 13 Oct 1994 06:54:01 UT From: Gary Benson \ Organization: Fluke Corporation, Everett, WA Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Having now reviewed the Fact Sheet that describes the MS plans for SGML | Author for Word, it would appear that Word would be able to | create/input an SGML file. However, the formatting will still be | restricted by the oversights in Word, e.g., to use my favorite example, | you still will not be able to use dictionary style definitions and do | automagic numbering of clauses and the table of contents. Hard coding | is just unacceptable. | | It is unclear as to what restrictions will be placed on the | specification of a DTD. Yes. My boss read the sheet, picked up on the line about "all the capabilities of Word", and asked, "Is that right? Can SGML do running heads?" My answer (as usual) was, "Of course. 
It can do hypertext links and index hits and tables of contents, too". He pursued the matter, "But running heads are a page-oriented feature. I thought SGML was about data structures, how can I tell it about page things like footer fonts and other page fixtures?" I admit I did not have a good answer. What did I miss the opportunity to tell him this time? -- Gary Benson-_-_-_-_-_-_-_-_-_-inc@tc.fluke.com_-_-_-_-_-_-_-_-_-_-_-_-_-_- Inventions reached their limit long ago, and I see no hope for further development. -Julius Frontinus, 1st century AD Newsgroups: comp.text.sgml Date: 13 Oct 1994 13:42:49 UT From: Paul Constantine \ Organization: Yale University Message-ID: \ Subject: SGML '94 Could someone either post (re-post?) or e-mail me information regarding SGML '94? Thanks! Paul J. Constantine Electronic Text Center Yale University Library Paul.Constantine@Yale.Edu Newsgroups: comp.text.sgml Date: 13 Oct 1994 13:48:16 UT From: Seong-Joon Yoo \ Organization: Syracuse University CIS Dept. Message-ID: <37jdr0$ea5@newstand.syr.edu> Keywords: SGML user's group Subject: How can I join the SGML user's group? Is there any site that I can send email to join the SGML user's group, especially the SGML database SIG? Please give me any pointers. Thank you. -- Seong-Joon Yoo sjyoo@top.cis.syr.edu Ph.D. Student CIS Syracuse University Syracuse, NY 13244 +1 315 443 1073 Newsgroups: comp.text.sgml Date: 13 Oct 1994 15:33:35 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Gary Benson] | Yes. My boss read the sheet, picked up on the line about "all the | capabilities of Word", and asked, "Is that right? Can SGML do running | heads?" | | My answer (as usual) was, "Of course. It can do hypertext links and | index hits and tables of contents, too". 
He pursued the matter, "But | running heads are a page-oriented feature. I thought SGML was about | data structures, how can I tell it about page things like footer fonts | and other page fixtures?" I admit I did not have a good answer. | | What did I miss the opportunity to tell him this time? The problem is the word "do" as in "It can *do* hypertext". Unfortunately, SGML doesn't *do* anything -- it's merely a data representation system. It's the processors that act on SGML data that do these things. Therefore, you can get running heads from an SGML version of Word because Word does running heads. As long as you can tell it from what part of the SGML-represented data to get the running head information it needs, Word is happy. Ditto for TOCs or index entries or any of the other things that would normally be generated from the core data elements (e.g., section titles get used as the running head text and as the text for TOC entries). This sort of conversation shows the damaging effect that word processors and DTP systems have had: they've caused people to forget that much of what goes on a page or on a screen is transitory and generated, not an intrinsic part of the base information. Word processors (more than DTPs), because they deal purely with the page representation of the information, force us to think about and author explicitly all "page objects", with the true information objects implied by the page objects. SGML-based systems are the reverse: they force us to think explicitly about the information objects and give us the page objects as part of their value add (I almost said "for free", but that's not quite true, although it's almost true). I have worked with structured information for many years, first using line editors like Xedit and VI, then using SGML editors. I always considered myself close to optimally efficient with those systems. Lately I've been forced to use DTP and word processing systems. 
Let me tell you that these systems are *not* easier to use. It is very much harder to do many things in DTP systems than it is to do them in SGML systems. What is easier to do is to get a particular page image. But consider the problem that Frame and Word both present when you want to represent a simple document type with elements that can occur in many contexts. Because styles in Word and Frame have little or no understanding of their surrounding context, where in SGML I might have the element types UL, OL, and LI, representing nestable unordered and ordered lists, in Frame I'll need the following components to represent just two levels of nesting: UL1item-first UL1item UL1item-last OL1item-first OL1item OL1item-last UL2item-first UL2item UL2item-last OL2item-first OL2item OL2item-last Plus these paragraph components to allow paragraphs within list items: LIPara1 LIPara2 Clearly there is a combinatorial explosion as your documents become more complex and as you try to capture more semantic distinctions with the style names. Compare that with the equivalent SGML element declarations: \ \ \ Using Frame or Word, it is very easy for a user to get the nested list structure wrong: they'll forget a -first or -last item, they'll put an OL2 before an OL1, etc. With an SGML editor, they can't get it wrong. The problem with SGML editors, IMHO, is not that SGML is hard to work with, but that the user interfaces have not reached the level of smoothness that word processors and DTP systems have achieved. This EUI quality is independent of the actual features the systems provide. For example, I like Frame's user interface a lot -- I find it pretty smooth and well optimized for the tasks, even though the tasks themselves are often much more tedious than they would be if I were using an SGML editor. If Author/Editor or Arbortext had the same EUI qualities as Frame or Word, they'd be hands-down killer apps.
This is why I think both the Word offering and FrameBuilder have real potential: they have good user interfaces. At this point, FrameBuilder lacks only true native SGML support and a little better protection of the markers that represent SGML-specific constructs like attributes. But for a Frame user, moving to FrameBuilder will probably be to their advantage because the presence of structure makes Frame *even easier to use*. Even without SGML import/export out of the box, FrameBuilder helps because it makes Frame documents easier to create and makes the Frame data itself more reliable and consistent. The SGML Author product should have the same properties and benefits. -- \
W. Eliot Kimber (kimber@passage.com) Systems Analyst and HyTime Consultant Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400 465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911 \
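The style-name arithmetic in the post above can be checked with a short sketch. It assumes the naming scheme shown in the post (list kind, nesting level, "item", and a -first/-last suffix, plus one LIPara per level); the function name `flat_style_names` and its parameters are hypothetical and exist only for illustration, not as part of Frame, Word, or any SGML tool.

```python
from itertools import product

def flat_style_names(kinds=("UL", "OL"), depth=2,
                     positions=("-first", "", "-last")):
    """List the style names a context-blind formatter would need.

    Hypothetical helper: it follows the naming pattern used in the post,
    one name per (nesting level, list kind, position) plus one paragraph
    style per nesting level.
    """
    items = [f"{kind}{level}item{pos}"
             for level, kind, pos in product(range(1, depth + 1), kinds, positions)]
    paras = [f"LIPara{level}" for level in range(1, depth + 1)]
    return items + paras

# Element types needed when nesting context is carried by the markup itself.
ELEMENT_TYPES = ("UL", "OL", "LI")

for depth in (2, 3, 4):
    print(depth, len(flat_style_names(depth=depth)), len(ELEMENT_TYPES))
```

With the defaults this reproduces the fourteen names listed in the post. Under this scheme the count is levels x kinds x positions plus one paragraph style per level, so every extra list kind or semantic distinction multiplies the total, while the number of element types stays at three.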
Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:10:20 UT From: Erik Naggum \ Organization: Naggum Software; +47 2295 0313 Message-ID: <19941013T181021Z.enag@ifi.uio.no> References: <37htslINNogp@afshub.boulder.ibm.com> Subject: Re: function character parsing [Wayne L. Wohler] | In my testing using our parser (a version of YASP), I discovered that | this works fine for element content and CDATA attribute values but that | this solution does not work for some other contexts, specifically PIs, | comments and CDATA marked sections. this is certainly not what one expects. I surmise that this interpretation is caused by very literal reading of a few requirements in the standard out of context. re processing instructions: "no markup is recognized in system data other than the delimiter that would terminate it." [341:4] re comments: "no markup is recognized in a {comment}, other than the /com/ delimiter that terminates it." [391:12] there is a note to a similar effect for marked sections. \ but this is not all. MSICHAR, MSOCHAR, and MSSCHAR are not markup, but function characters, so are not "recognized" or "not recognized" to have their effect. from 6.2 SGML Entities: "If an {SGML character} is a {function character}, its function is performed in addition to any other treatment it may receive." [297:20] #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:20:18 UT From: Ed Holst \ Organization: Lockheed Aeronautical Systems Company Message-ID: <1994Oct13.182018.11056@enterprise.rdd.lmsc.lockheed.com> References: <9409117818.AA781866386@njcorp.akbs.com> Subject: Re: Change of employer & contact info for Chery Do I know this lady from a past life? Welcome to the happy world of Interneting (sp?). -- PEACE--THROUGH SUPERIOR FIREPOWER THE OPINIONS EXPRESSED HEREIN ARE NOT NECESSARILY MY OWN!! SOMETIMES I JUST CAN'T HELP MYSELF!! 
eholst@grunt.lasc.lockheed.com Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:46:55 UT From: "Liam R. E. Quin" \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Oct13.184655.26716@sq.sq.com> References: <36bvg6$n9e@kaa.heidelbg.ibm.com> \ <1994Sep30.024603.29919@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Peter Kerr] | ISO 10646 compliance could be a systematic way to ensure that | authoring/browsing systems work properly. [Richard L. Goerwitz] | I don't think ISO 10646 is the be all and end all of encoding | standards. There will always be simple 8-bit systems, and there will | probably be many many 16-bit, 32-bit, and perhaps still other systems. This is silly (sorry). ISO 10646 does not say anything about the width of an integer. You could implement ISO 10646 on a Commodore PET if you wanted, were silly enough, and had enough time. What (Unicode + 10646) says is: when you see this sequence of input bytes, this is the character that has been represented; when you see this set of characters in this sequence, produce this combination of glyphs. | In fact, what we'll need - for maximum flexibility - is a language at- | tribute *and* an encoding attribute. Both will need to be embedded in | the document. Until recently I would have agreed. I have since understood more about Unicode and ISO 10646. I _do_ agree that you need a language tag. You probably also want a DefaultScript attribute, since you can write (for example) Vietnamese text in a variety of scripts, or, if you prefer, you can present Greek or Devanagari (whether Sanskrit or Hindi) in a roman transliteration. It might be possible to put the burden of translating from (say) UTF-1 into ISO 88951-2 in the server, as part of an HTTP negotiation. That wouldn't help mixed-script documents, though.
Note that if you want to give every glyph a distinct character, and don't have floating accents, you need more than 256 characters to represent polytonic (Ancient) Greek, because there are more than 256 glyph combinations [see note 1 for examples]. I am using Greek as an example partly because there are many classicists and bible scholars [2] on the net, and partly because it makes some of the issues clearer to people intimidated by large Asian character sets! Why am I saying all this? :-) 8-bit systems cope with this today, usually with floating accents, but also by mixing fonts on the screen, e.g., with escape sequences. An `8-bit-character-set mentality' (I know you were not espousing this) causes considerable problems, not just for multi-lingual documents. A richer system that maps into 8-bit display codes with multiple glyph sets is entirely possible. Don't rule out Unicode or ISO 10646 because you don't have a 32-bit computer! Lee Notes [1] The main Greek accented combinations are \\a /a \\`a \\'a /`a /'a \\~a /~a `a 'a ~a where a can be a, ai, e, h, i, o, u, w, ei; the `ai' indicates an alpha with a tiny iota (iota subscript) under it. \\ and / are lenis and asper (smooth and rough breathing), ' is acute accent, ` is grave accent, ~ is tilde/long/circumflex; in addition, rho can take \\ and / (as in `rhythm'), and in some schemes there can be a dot over or under any letter independently of other accents, for example to mark inferred or corrupted text. You can experiment with the shareware Wingreek package on the PC, which also has limited support for bidirectional Hebrew with vowel marks. [2] New Testament Greek can be done with 8 bits, because it's OK to omit many of the accents.
However, at least some printings of the New Testament use the full accents, which is a big help to beginners at learning the language (as opposed to the script) like me :-) -- Liam Quin, Manager of Contracting, SoftQuad Inc +1 416 239 4801 lee@sq.com HexSweeper NeWS game;OPEN LOOK+XView+mf-fonts FAQs;lq-text unix text retrieval SoftQuad HoTMetaL: ftp.ncsa.uiuc.edu:Web/html/hotmetal, and also doc.ic.ac.uk: packages/WWW/ncsa/..., gatekeeper.dec.com:net/infosys/Mosaic/contrib/SoftQuad/ Newsgroups: comp.text.sgml Date: 13 Oct 1994 20:07:57 UT From: Simon North \ Reply-To: north@direct.iaf.nl (Simon North) Message-ID: <53@direct.iaf.nl> Subject: Why SGML and NOT Acrobat I, too, would like to throw in my 2 cents to the discussion. I am not an American citizen and didn't (may not) vote for Clinton (or her husband) and so I have little chance to directly influence what happens on your side of the "pond". However, ... America is a member of NATO, and the NATO countries very often follow the American example. As an illustration, I have (draft) copies of JSPs (British "Joint Service Publications" - closest equivalent to US DoD and Mil standards) for IETMs. 90% of the text is copied directly from the three US Mil standards. So, certainly from the defense point of view, whatever "you lot" get up to is extremely important to us in Europe. Please remember that when you go lobbying. More to the point, when the company I work for was taken over by another (foreign) country we were asked/instructed to investigate what the cost would be to convert to using the same DTP/WP software as them. Without going into the gory details, the cost was FRIGHTENING. Even if the company was prepared to spend the money, the resources are simply not available (time & manpower mainly). Filtering helps a little but even with these packages (believe it or not, FrameMaker and Interleaf), a lot of manual intervention is still required and what is far, far worse, the results are unpredictable.
Even within our software packages, migrating from version to version NEVER goes without problems ... and, as pure customers of these products we are at the "blunt end", often left with no choice but to accept what the manufacturer thinks we want, or knows what is the best consensus of the total market requirements. The issue is not just accessibility. If I were that bothered I can either distribute a run-time viewer with the documentation, or I could simply use something like (Display) PostScript (also an Adobe product). I am prepared to grant the point that our interests and needs are very specific (longevity of the data must be in the region of 15+ years), but THE reason that SGML is so attractive is not that it is platform independent - far more importantly it is VENDOR independent - or at least, it has the potential to be so (pause for a sigh in the direction of Microsoft, Frame Technologies and Interleaf ... boy, you guys are making it soooooo hard with your not-quite-SGML applications that are giving me even more grey hairs working out how to get SGML into your products and then get it out again). SGML is not THE answer, yet, (far too document and sequentially oriented), HyTime gets closer (I'm still waiting for the first real application software), maybe SGML2 will be THE answer. Whatever, the important thing is that SGML is an ISO standard which, IMHO, makes it OUR property NOT that of some software company ... Needless to say, this is MY e-mail account and also my opinion.
-- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: comp.text.sgml Date: 14 Oct 1994 02:23:12 UT From: Evan Rudowski \ Organization: The Pipeline Message-ID: <37kq2g$ivv@pipe2.pipeline.com> Subject: Re: CALL FOR SGML NEWS ITEMS for SGML '94 [Yuri Rubinsky] | C A L L F O R B I T S | _________________________ | | By now a five-year tradition, the SGML Year in Review continues to open | both the North American and European SGML series of conferences. | | As always, it attempts to represent exciting new projects and | initiatives as well as provide updates on on-going work of interest to | the conference attendees and to the readers of this newsgroup and the | half-dozen newsletters that now reprint the presentation. | | SGML '94 is fast approaching (Nov 6 tutorials; Nov 7-10 conference; Nov | 10-11 SGML Open events), and with it, the Year in Review Call for | Contributions. How can I get details about SGML '94? Thanks. -- Evan Rudowski Newsday Electronic Publishing EVANRUD@PIPELINE.COM Newsgroups: comp.text.sgml Date: 14 Oct 1994 03:51:38 UT From: Tim Arnold-Moore \ Organization: CITRI Message-ID: <37kv8a$6p3@etrog.citri.edu.au> References: <374l6q$fuv@usenet.rpi.edu> <3757ki$4e5@scapa.cs.ualberta.ca> Subject: Re: Storage Models for SGML data [Chiradeep Vittal] | Documents are complex entities (hierarchical, as you point out). An | RDBMS is hardly the ideal candidate to model this complexity. An | object oriented DBMS would be the ideal candidate because it can do all | the 3 things you have listed above. There is a paper around describing exactly such a system: V. Christophides, S. Abiteboul, S. Cluet and M. Scholl "From Structured Documents to Novel Query Facilities" in Proceedings of the ACM SIGMOD Conference 1994, Minneapolis, MN, pp313-324. 
However I am not convinced that this approach is actually feasible for any reasonable sized collection. OODBMS's are just too slow and you end up having to do an awful lot of join operations. The work at INRIA is interesting: G.E. Blake, M. P. Consens, P. Kilpelainen, P.A. Larson, T. Snider and F.W. Tompa "Text/Relational Database Management Systems: Harmonising SQL and SGML" in Proc. Int. Conf. on Applications of Databases, 1994, Vadstena, Sweden, pp 267-280. as it is based on PAT trees, a novel indexing/data modelling approach. Here at CITRI we have also done some work criticising both of these approaches, preferring an element-based text-specific model over a mapping to another paradigm, as this brings the issue of efficiency to the fore in designing the model as well as implementing it. Because we start with a text-oriented model, we can take advantage of all the work in efficient access that has taken place in this field. A tech-report of our work is available and I am quite happy to send a postscript or dvi file to anyone who wants it. | I'm currently implementing an SGML/HyTime database on a commercial OO | DBMS. A tech report should be available within a couple of weeks. Our | target application is multimedia news, and we assume zero updates to | the document, so our storage model may not be entirely appropriate for | your needs. Might work for small documents like newspaper articles and smallish collections but I bet you have serious performance problems with gigabyte size databases and large documents. -- Tim Arnold-Moore | CITRI, RMIT | Uni. of Melbourne Law School tja@kbs.citri.edu.au | 723 Swanston St | ---------------------------- Phone: +61 3 282 2487 | Carlton 3053 | simul iustus Fax: +61 3 282 2490 | Victoria, Australia | et peccator Newsgroups: comp.text.sgml Date: 14 Oct 1994 05:15:27 UT From: "Richard L.
Goerwitz" \ Organization: University of Chicago Message-ID: <1994Oct14.051527.25155@midway.uchicago.edu> References: \ <1994Sep30.024603.29919@midway.uchicago.edu> <1994Oct13.184655.26716@sq.sq.com> Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | I don't think ISO 10646 is the be all and end all of encoding | standards. [Liam R. E. Quin] | This is silly (sorry). ISO 10646 does not say anything about the width | of an integer. You could implement ISO 10646 on a Commodore PET if you | wanted, were silly enough, and had enough time. It's only silly because you misunderstood. The process of encoding a writing system is much more complex than a simple mapping of glyphs onto 8-, 16- or 32-bit numbers (or bit-fields). Many issues will go on being debated for a long, long time. ISO 10646 will never satisfy everyone, and it makes sense to allow latitude for alternate encoding schemes. | Don't rule out Unicode or ISO 10646 because you don't have a 32-bit | computer! Again, this has nothing to do with what I was saying - although it's a perfectly valid point to make here. | [2] New Testament Greek can be done with 8 bits, because it's OK to | omit many of the accents. However, at least some printings of the | New Testament use the full accents, which is a big help to beginners | at learning the language (as opposed to the script) like me :-) Okay, now try fully cantillated Hebrew, or Quranic Arabic, or (better yet) Akkadian cuneiform. -- -Richard L. Goerwitz goer%midway@uchicago.bitnet goer@midway.uchicago.edu rutgers!oddjob!ellis!goer Newsgroups: comp.text.sgml Date: 14 Oct 1994 08:38:55 UT From: Johan Henrikson \ Message-ID: <9410140838.AA04551@lars.texcel.no> Subject: SGML and the EEC I'm looking for more information regarding SGML and the EEC. Rumors have it that the EEC is increasing focus on ODA (!) and might be dropping SGML from their recommendation.
Despite the absurdity of this rumor I would like to check out the current status of SGML in the EEC. If anyone has any suggestions pertaining to relevant documents, government agencies, etc to contact please get in touch with me. Thank you in advance, Johan Henrikson -- Johan Henrikson E-mail: johan@texcel.no Texcel AB Address: Storsaetragraend 12, 127 39 Skaerholmen, Sweden Newsgroups: comp.text.frame,comp.text.sgml Date: 14 Oct 1994 09:37:10 UT From: Wiedmer Hans Ulrich \ Organization: Inst. for Machine Tools and Manufacturing, ETHZ Message-ID: <37ljg6$26l@elna.ethz.ch> Subject: SGML-related Frame Products: no university licensing?! Dear all, I am told by a local Frame distributor that there is no university licensing available. This holds for every Frame Product related to SGML (i.e., FrameBuilder, SGML Toolkit, SGML Utilities etc.). It sounds like this is just the policy of Frame Technology Inc. DO OTHER UNIVERSITIES EXPERIENCE THE SAME SITUATION ?????? - if yes: is there an understandable reason for Frame Technology to do so? - if no: could anybody give me an address of a Frame distributor in the "neighbourhood" (whatever that may mean, let's say Europe). My apologies to those who see this twice (you're the right people to address! but there is no possibility to address only the intersection of the two newsgroups). -- Kind regards John Wiedmer Swiss Federal Institute of Technology (ETH) Zurich Newsgroups: comp.text.sgml Date: 14 Oct 1994 12:06:04 UT From: Kent Summers \ Organization: Electronic Book Technologies, Inc. Message-ID: <37ls7c$917@spock.ebt.com> References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> <3762vk$irc@deep.rsoft.bc.ca> <37c359$j04@bmw.hwcae.az.Honeywell.COM> Subject: Re: SGML "Viewers" [Kevin Brennan] | Temporary Revision capability allows the using organization to provide | revisions to the text contained on the CDROM without cutting a new | CDROM.
For example, an airline using maintenance manuals on CDROM may | receive revisions to the manual *between* issuances of new CDROMS. | Jouve's software puts a revision file on the local hard drive and | replaces or annotates text appropriately. In the case of a networked | resource, there can be a shared file on the network. There of course, | is a separate TR software (optional at some cost) that the Maintenance | Publication department would have to acquire. I'd like to point out that before you can really deliver incremental updates to a browser (local or remote) you really need to be able to detect and track deltas in the context of a sophisticated SGML version control/configuration management system. This is exactly what dynabase is designed to do: provide the electronic equivalent of "printed change pages," but at a much lower level of granularity --> an SGML-based, fine-grain revision control environment. Since EBT makes both dynabase and dynatext, one could expect that we would be making this incremental update capability an integral part of our products in the future. CD-ROM? well, we've heard a lot of them over the past year; and today, it's internet this, internet that. are you starting to get the message? do it once, and send it out your favorite way. SGML is the key. EBT is the enabler. stay tuned. Newsgroups: comp.text.sgml Date: 14 Oct 1994 12:36:13 UT From: Yuri Rubinsky \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Oct14.123613.21924@sq.sq.com> References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> \ \ Keywords: nist, fips, braille, document, icadd Summary: Call it a FIPS for a portable page display format Subject: Re: Accessibility and access to the Internet [Dexter Medley (Larry)] | With all due respect to Yuri and others in this thread, it's much ado | about (probably) nothing.
Here's a little more of the straight poop: | | NIST has convened a so-called "blue ribbon" panel composed of | approximately 25 government agency representatives who are involved in | electronic document publishing and dissemination to formulate advice to | NIST concerning 1) whether or not there should be a FIPS for a page | description format, 2) what capabilities that format should provide, | and 3) what that format should be. Toward these ends, Adobe has | already made a presentation of the capabilities of its PDF; competing | portable document technologies will be presented to the panel at an | upcoming session (e.g., Replica, Common Ground, and Envoy are | included). The panel will then formulate its recommendations. (Just out of curiosity, and before I get going here, do I understand this correctly, that PDF got a whole day on its own, and all possible alternatives get lumped into just one day? Just curious.) Larry's first paragraph touches on my main area of concern, and it involves defining the task and therefore defining the scope of the proposed US Federal Information Processing Standard. Larry calls the FIPS "a page description format" which I'm completely comfortable with, but also calls it a "portable document technology". I have no problem with page description format, or page display format, or even portable page display format. My concern is with the presentation of a format geared specifically to page description *as if it were* a "document format". For the purposes of this discussion, I have only one constituency -- and it's not the SGML vendors: I work on the International Committee for Accessible Document Design, the committee working on the accessibility of *documents* by the visually impaired. Note that "documents" in that sentence is used very particularly to imply that there is a body of information pulled together from whatever sources, in whatever fashion, waiting, so to speak, to be presented appropriately to its user/reader.
I believe that that's the only fair use of the word document, in fact, or else people who can't see the paper, or who can't make out the words or letters on the paper, are completely disenfranchised from the world of documents. Certainly we also get to call those old-fashioned bits of squished pulp with type on them documents too, but a standard that purports to deal with "documents", I think, *must* use the broader definition, the definition that includes the ICADD constituency. Or else, as far as I understand US law, they're in fact breaking it. The Americans with Disabilities Act is intended to prevent government from creating non-accessible information. (I've not read that law; just basing this observation on rumours about that law.) | The reasoning of NIST and the panel (a significant number of whom are | from agencies already very active in and committed to SGML) is that | such a delivery method may be of use to government agencies for | relatively small applications and for a relatively short period of | time, until SGML, DSSSL, SPDL, etc., further mature commercially and | more COTS practical SGML solutions are available, and that there may be | benefits to be gleaned from standardizing on one solution. Virtually | no one sees page description formats as a competitor to SGML, but | rather as a complement or adjunct. This is admirable, and I trust that the FIPS will include a discussion of its temporary status. Adobe contributed heavily to the development of SPDL, after all. Again, note the use by Dexter/Larry here of "page description formats", a completely acceptable term. I hope the FIPS uses the same language. | Since someone already mentioned Mike Rubinfeld of NIST, who, along with | Larry Welsh, is sponsoring this effort, it would be much more efficient | for someone to ask them directly about this program before getting this | world-wide representative group in such a tizzy... I stand by my remarks. 
If there is a tizzy it's because it's worth while there being a tizzy. If the FIPS -- even by virtue of its name -- pretends to be more than it is, I will continue being tizzified. | (Usual dis/claimer: I'm speaking for myself here, not the government; | and yes, I'm one of those on the panel, so this is first-hand | information.) | | -- Larry I appreciate that we have first hand information. I take it that the flip side of that is that we have someone listening who is in a position to make good on the context and role described in his posting -- and will do his best to ensure that the 7% of Americans who have a print impairment are not disenfranchised by a display format that (assuming it's stock PDF): a) cannot, as far as I know, give them the navigation they need since it's lost all the hierarchy that sighted people take for granted. It is foolish to get a screen reader to read you converted PDF pages, in sequence. If you're sighted, how would you like to use a cassette tape recorder to look up something in an encyclopaedia? For sensible navigation you need something that understands the value of presenting the contents as a navigable tree structure, something like IBM Book Manager, Lector, Olias, DynaText, Pathways, SoftQuad Explorer, etc. And the information for that navigable structure *must* come from somewhere, for example, it could simply be implied by the SGML markup. b) cannot, as far as I know, give people who need large print the opportunity to set a default minimum size for all type -- that is they must use PDF as if it holds a printed page: If they want to make a 6point footnote legible as a 24point display, they're also, of necessity, turning the 12point base text into outrageously large 48point. c) cannot -- despite what was said in a previous posting -- be readily turned into Braille because it has already lost the markup needed to generate the blanks, indents and special characters that are the only actual formatting capabilities of Braille. 
(When we see an SGML-enhanced PDF I'm willing to grant this one.) (In fact, probably all my concerns would go away under that circumstance.) I would be very happy to discover I'm wrong about any of these assumptions. I'll retract anything anytime. But nothing I've seen from the postings in this newsgroup or many phone conversations on this subject convince me otherwise. Don't get me wrong: I'm a fan of Adobe and what it's done in the world. I think PDF solves or partially solves a large set of problems that need solving. I'm only concerned about it purporting to solve *all* the problems -- even ones it wasn't designed to solve. And it all stems from calling itself a "portable document format" as if documents are only for people who can see. Yuri Rubinsky Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:14:20 UT From: Erik Ableson \ Organization: Achilles Networking Inc, Ottawa, ON Message-ID: <37m3ns$j7c@hermes.achilles.net> Subject: DTD2HTML I recently ran across a msg containing a reference to a utility called DTD2HTML, designed to simplify the process of converting SGML data into a format suitable for use with a WWW server. If anyone can point me to a site that has this software (or the mfg, if it's commercial) it would be appreciated. -- Erik Ableson Ableson Consulting eableson@zeus.achilles.net P.O. Box 1641, Stn 'B' +1 613 762 5299 Hull, PQ J8X 3Y5 Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:48:06 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: <37m3ns$j7c@hermes.achilles.net> Subject: Re: DTD2HTML [Erik Ableson] | I recently ran across a msg containing a reference to a utility called | DTD2HTML, designed to simplify the process of converting SGML data into | a format suitable for use with a WWW server. 
No, this just reformats and displays a DTD in HTML form, not an instance, as far as I know (but I'd love to be proved wrong :-) ///Peter -- Peter Flynn, Computer Centre, University College Cork, Ireland | pflynn@curia.ucc.ie | +353 21 276871 x2609 | +353 21 277194 (fax) | Opinions are my own. | "...persuade users that spreading fonts across the page like peanut butter across hot toast is not the route to typographic excellence..." Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:53:18 UT From: Rick Jelliffe \ Message-ID: \ References: \ <374tum$rff@hopper.acm.org> Subject: Re: response to miniSGML [Dave Peterson] | ISO 8879 certainly encourages this "marker in the text" point of view. | While ordinary elements have their content bracketed by a pair of tags, | even if that content happens to be empty, EMPTY elements do not have a | pair of tags, just one tag that says "here goes something special". Goldfarb writes that "An element has a start-tag and an end-tag, but there is a situation in which one of these might not be there... Technically, the content always exists, even if it is empty and looks as if it isn't there." (p307: bad pencil means maybe bad transcription) His comment on 7.3 is also clarified in the Attachment 1 printed in the Handbook. For example, if there is an explicit content reference in a start tag, then (even though an end-tag is not allowed in the markup) the data for that explicit content reference (which the application provides) is treated as if nested between a start tag and an end tag. (It is not omittag minimization, because that is discretionary: this is enforced.) But I think we have fallen into the trap of trying to make the standard mean only one clear and definite thing.
I resign immediately (-: -ricko -- Rick Jelliffe ricko@allette.com.au Newsgroups: comp.text.sgml Date: 14 Oct 1994 15:55:25 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <37m9ld$e50@hopper.acm.org> References: \ <374tum$rff@hopper.acm.org> \ Subject: Re: response to miniSGML [Dave Peterson] | ISO 8879 certainly encourages this "marker in the text" point of view. | While ordinary elements have their content bracketed by a pair of tags, | even if that content happens to be empty, EMPTY elements do not have a | pair of tags, just one tag that says "here goes something special". [Rick Jelliffe] | Goldfarb writes that "An element has a start-tag and an end-tag, but | there is a situation in which one of these might not be there... | Technically, the content always exists, even if it is empty and looks | as if it isn't there." (p307: bad pencil means maybe bad | transcription) | | His comment on 7.3 is also clarified in the Attachment 1 printed in the | Handbook. Goldfarb's interpretation is his own, as is mine. | But I think we have fallen into the trap of trying to make the standard | mean only one clear and definite thing. I resign immediately (-: I sincerely hope that SGML can evolve to interpretations not envisioned by its creators, without destroying the original interpretations. Such evolution is not uncommon. Else why is the mathematical study of periodic phenomena called "trigonometry"? Dave Peterson SGMLWorks! davep@acm.org Newsgroups: comp.infosystems.www.providers,comp.text.sgml Date: 14 Oct 1994 16:37:30 UT From: Brooks Cutter \ Organization: AT&T Paradyne Message-ID: <37mc4a$me8@pdn.paradyne.com> Subject: SGML/HTML Style Sheets? A while ago, I read about an X11 WWW browser (Viola?) that was described as having a "nifty" feature of supporting SGML Style sheets... I know little about formal SGML, but I presume that a Style sheet allows one to define the display layout of an SGML document?
Can someone please point me towards more information on this? I've looked around the web (including w3catalog), and haven't found any info on this... I've been developing a perl library for parsing and displaying HTML (for a browser I'm working on), and would like to make the parser/display library as generic as possible, and support style sheets... thanks for any info you can provide.. -Brooks bcutter@paradyne.att.com -- Brooks Cutter (I: bcutter@paradyne.com) (U: ...uunet!pdn!bcutter) Unix System Administrator, Engineering Services AT&T Paradyne, Largo Florida (Opinions expressed are my own and do not reflect those of my employer) Newsgroups: comp.text.sgml Date: 14 Oct 1994 17:23:17 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <37meq5$ef4@hopper.acm.org> References: \ Subject: Re: What is SGML? [Bill Wiest] | I am pondering the question "What is SGML?" and I have a few questions | to ask here relating to that. | | Would it be fair to say that the reference concrete syntax is *really* | more of a generalized markup language definition language than a | concrete example of SGML? The syntax "productions" that define the language of SGML start by defining "SGML document" in terms of other things (called by comp sci language specialists "non-terminals"), which in turn are defined in terms of yet other things, ..., ultimately in terms of characters. Just before the last step, each special character string used for SGML punctuation is given a name. Then the last step maps these names onto specific character strings. For example, the opening character string for an end-tag is named "etago" (for end-tag open), and is "usually" mapped to "</" [...] if you have access to that newsletter. (I wrote the article.) | Is it true that any markup terms defined using the reference concrete | syntax (or any variant concrete syntax) are in fact concrete examples | of SGML? Presumably a "concrete example of SGML" is an SGML document.
Presumably also a "markup term" is a non-terminal such as "start-tag" or "stago" -- or "SGML document" -- as defined in one of the productions of ISO 8879. If this is what you mean by each of these phrases, then the answer is "no". Just as "book" (the abstract class) is not a book, "SGML document" is not an SGML document. But I suspect this distinction is something most of us would leave to the philosophers. | Is SGML per se abstract? Unless you have some technical meaning in mind for "abstract", I think this one too gets passed to the philosophy section. Did I guess your meanings correctly? Hope I helped. Dave Peterson SGMLWorks! davep@acm.org Newsgroups: comp.text.sgml Date: 14 Oct 1994 18:46:28 UT From: Joel Finkle \ Organization: Searle R&D Message-ID: <1994Oct14.184628.21488@tin.monsanto.com> References: <53@direct.iaf.nl> Subject: Re: Why SGML and NOT Acrobat [Simon North] | SGML is not THE answer, yet, (far too document and sequentially | oriented), HyTime gets closer (I'm still waiting for the first real | application software), maybe SGML2 will be THE answer. Whatever, the | important thing is that SGML is an ISO standard which, IMHO, makes it | OUR property NOT that of some software company ... Personally, I'd like to see a hybrid file format that can serve both masters: 1) A page-turner application that can display the full appearance of the original paper document (that regulations still require) and 2) A structured document editor that allows me to search and extract the portions relevant to me It seems to me that an Acrobat-like file format could be extended to contain the SGML-style tags, that are not (necessarily) visible to the page-turner, but extractable in original SGML form. Displaying an SGML doc on the fly is slow, especially when you want to see something in the middle. Having just a page-turner limits the content-based retrieval. A cross-breed is what I'd like to see.
-- Joel Finkle Searle R&D jjfink@skcla.monsanto.com "And when I die don't bury me / in a box in a cemetery. Out in the garden would be much better, I could be pushin' up home grown tomatoes" -- Guy Clark, "Home Grown Tomatoes" Newsgroups: comp.text.sgml Date: 14 Oct 1994 19:23:35 UT From: Simon North \ Message-ID: <57@direct.iaf.nl> Subject: DynaText will not print. Any ideas? Under UNIX, the standard DynaText browser is unable to print bitmap illustrations from a separate window. Using our customised version of the user interface (built with the Developers' Toolkit), we can't get DynaText to print _any_ graphics at all from the main text window. Under MS-Windows, the standard interface works fine (after some juggling with printer drivers) but the same custom interface is unable to print vector graphics from the main text window. Has anyone else been experiencing similar (or even different) problems? Desperate minds would like to know. -- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: ba.seminars,alt.hypertext,comp.text,comp.text.sgml,comp.multimedia,comp.unix.misc,comp.graphics Followup-To: poster Date: 14 Oct 1994 20:21:02 UT From: Paul Fronberg \ Organization: Sun Microsystems Inc., Mountain View, CA Message-ID: <37mp7e$j2f@engnews2.Eng.Sun.COM> Keywords: text, multimedia, hypertext, sgml, hytime, document formatting Subject: SVNet Meeting October 19: Extending SGML, Hytime, Hypertext, & Multimedia SVNet Meeting, Wednesday, October 19, 7:30 pm in Mountain View SVNet is a SF Bay area UNIX and Open Systems user's group which sponsors technical presentations at its monthly meetings. The meetings are free and open to the public.
The next presentation will be: WHAT: Extending SGML: HyTime, Hypertext, and Multimedia Topics to be covered include: - basic SGML concepts and capabilities - the ``formatting fallacy'' - documents as databases - SGML architectural forms and the object-oriented paradigm - HyTime extensions to SGML - HyTime and inter-operable multimedia WHO: Ralph Ferris, Fujitsu Ralph Ferris is the Project Manager for Electronic Publications at Fujitsu Open Systems Solutions, Inc. (FOSSI), a wholly-owned subsidiary of Fujitsu, Ltd. His work is sponsored by the department within Fujitsu's Middleware Division that is responsible for developing word processing and electronic publishing products. His areas of responsibility include market research on hypertext and multimedia, SGML/HyTime tool development, and representing Fujitsu and FOSSI in various industry work groups. WHEN: Wednesday, October 19, 1994 at 7:30 pm WHERE: Sun Microsystems Bldg 6, 2750 Coast Avenue, Mountain View Coast Ave appears to be just a driveway next to Bldg 5 on Garcia Ave between Amphitheatre Pkwy and San Antonio, so don't get confused. For more information, please call either Paul Fronberg at +1 415 366 6403 or Ralph Barker at +1 408 559 6202. SVNet is a UNIX and open systems user group supported by member dues and donations. SVNet Meetings are FREE and OPEN TO THE PUBLIC. UNIX is a registered trademark of X/Open Newsgroups: comp.text.sgml Date: 14 Oct 1994 21:26:48 UT From: "Christopher R. Maden" \ Organization: Electronic Book Technologies, Inc. Message-ID: \ Subject: Correction re: SGML Author for Word I just spoke with Dale Waldt of \, concerning Microsoft's SGML Author. I was incorrect in my interpretation of his article. Regarding the debate on auto-numbering, Microsoft tells him that anything Word can do, SGML Author for Word can do. That is, since Word can auto-number, SGML Author can as well.
He also clarified (and mentioned that his article was written under deadline) that in SGML Author, one is not authoring in SGML. It uses more complicated Word stylesheets to translate the Word document into SGML, in much the same way as Avalanche (whose technology it uses) or DynaTag. In other news, Microsoft asks that people talk to their sales representatives for more information about SGML Author; there are so many rumors going about, and the sales reps have the inspired Word Of Gates. :-) -Chris -- Christopher R. Maden Electronic Book Technologies, Inc. Applications Consultant One Richmond Square crm@ebt.com Providence, Rhode Island 02906 USA +1-401-421-9550 (voice) +1-401-421-9551 (fax) Newsgroups: comp.text.sgml Date: 14 Oct 1994 21:34:18 UT From: "M. James Bartley" \ Organization: Bell-Northern Research Ltd., Ottawa, Canada Message-ID: <37mtgq$l1e@bmerhc5e.bnr.ca> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: SGML vs PDF: DSSSL alters the equation? [Joel Finkle] | Personally, I'd like to see a hybrid file format that can serve both | masters: | | 1) A page-turner application that can display the full appearance of | the original paper document (that regulations still require) | | and | | 2) A structured document editor that allows me to search and extract | the portions relevant to me The problem with displaying SGML is deciding how to format it, right? And DSSSL is supposed to provide all the answers to this problem, right again? PDF is great for a lot of things - I've got about a dozen source applications whose files I need to distribute to people and SGML is not an option in this particular case, so PDF is a very nice solution. But, if you *do* have SGML, rendering it into PDF is still going to require formatting decisions (whether you embed the SGML into the PDF or not), so wouldn't it be best to use the standard designed for this? 
And if you have DSSSL at the origin (the editor), then you're bound to have DSSSL at the destination (the browser), so what do you need PDF or some other (possibly new hybrid) format for? Why not concentrate on getting the DSSSL support into the SGML editors and browsers so that people will be a lot happier about this situation than they are today? How many SGML tools vendors are making efforts to incorporate DSSSL support into their products? And if anyone says that DSSSL isn't finished yet and so can't be supported, I'd like to remind them that unfinished standards have never stopped vendors from going after potentially lucrative markets before. Just think ... with HTML getting its wrinkles ironed out so that the WWW will finally have the opportunity to do SGML properly in the form of HTML and take advantage of real SGML tools (have you ever tried to actually *find* something on the WWW? -- it's an absolute nightmare), this is the time to show the world what full-blown SGML can do. HTML browsers are free. PDF browsers are now free. If you want to sell your SGML tools, deliver an SGML browser for free. Show people just how minimal HTML is and they'll beat a path to your door to buy your editor. But that browser had better have a relatively decent (albeit basic) feature set, and that means you'll need DSSSL. I've got people here (more and more every day) who are turning Frame docs into HTML by the boatload and they love the results. It makes me cringe in pain because I try to tell them what they're missing but they think that what they have now is the neatest thing since sliced bread. The way to get through to them is to *show* them the real thing, but there's nothing there to show them. Give me that browser and I'll give you customers, but the browser has to do the job properly - not fancy, just right. -- M. James Bartley, Bell-Northern Research Ltd. \\ Internet: mjamesb@bnr.ca P.O.
Box 3511, Station C, Ottawa, Canada, K1Y 4H7 \\ Phone: +1 613 763 3556 ** Opinions expressed are mine, not BNR's. - MJB ** \\ Fax: +1 613 763 5048 "Stupid pigeons!" Newsgroups: comp.text.sgml Date: 14 Oct 1994 22:04:58 UT From: Russ Chamberlain \ Organization: WATFAC Message-ID: \ References: <3762qj$4p6@finnegan.iol.ie> Keywords: minor, problems Summary: I've done it with WATCOM C for DOS. Subject: Re: YASP, does anyone have a MS-DOS copy? [Charles Hurley] | Does anyone have a MS-DOS copy of YASP? [Sean Mc Grath] | I'd like to piggy back on this one if I may. I have looked through the | compilation scripts, etc., for YASP but I do not have access to an OS/2 | box to compile up a DOS version as required. | | I guess with a bit of work the REXX script could be analysed to produce | a DOS makefile for Microsoft or Borland and the code compiled directly | on a DOS based PC. If this has already been done, I would | appreciate any information. I managed it in about a day, using WATCOM C. Yes, re-creating the build scripts is a requirement, but it's not too hard. For DOS, create the TOOL, MDC, and CTX libraries. In each makefile, just keep the source-file dependencies, tear out everything else, and insert your favourite compiler and library directives. A key file to have a look at is PARSER\INCLUDE\PORTABLE.H. This file contains all sorts of definitions that will vary depending on the target machine, operating system, and memory model. For some reason unknown to me, the pascal and cdecl calling conventions are used extensively. Fortunately, it was a simple matter to define the relevant macros as empty. Note that at least one file, OCSMIN.H (I think), has some missing end-of-comment sequences (*/). I recall having a handful of _very_minor_ problems, one related to the use of const. Otherwise, all files compiled up just fine. The same makefile that makes the CTX library combines the three libraries into the YASP library.
As for the test applications, SMP00 is the one you need _and_ require to make first. This application generates OCS (Compiled SGML Declaration) and ODT (Compiled DTD) files. Note that the makefiles aren't kidding when they define a 40K stack. I used a 20K stack on small documents, and sometimes got a stack overflow. To get error messages, make sure the file called YSP.TXT is in the current directory. The RAST application just spits out the ESIS, but in a different format than SGMLS does. Speaking of SGMLS, I'd like to thank the people responsible for creating YASP. Such a clean, elegant, _understandable_, and *free* solution is, IMHO, a great stride forward for SGML and its supporters. I know I speak for many of us when I say: THANK YOU! THANK YOU! THANK YOU! THANK YOU! THANK YOU! THANK YOU! Good luck! - Russ PS - Sorry, I've only got advice today, and no binaries :-( Sorry this is terse, but I gotta go... -- Russ Chamberlain Waterloo Foundation For The Advancement Of Computing (WATFAC) russc@csg.uwaterloo.ca >> "The above are MY views, not CSG's, UW's or WATFAC's" Newsgroups: comp.text.sgml Date: 14 Oct 1994 22:13:28 UT From: "Steven J. DeRose" \ Organization: EBT, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data [Peter Flynn] | This is crucial and remains unaddressed by any SGML software I have | seen. Before you all jump on me and tell me (a) you can say \
or similar; or (b) that editors etc can activate on-screen | section numbering deduced from \
or similar elements, what I mean | is support for Canonical Reference. | | If I am in the midst of an instance, having got there via, say, a | search routine which has returned the exact occurrence of "foo" that I | was looking for, I now want to know where I am: I want the database | system to return not just "`foo' found at offset 123456" but something | like "`foo' found in \ #14 in \ #5 in \ #8 in \ | #3", according to some rules I specified before the search saying that | for all hits I want the structure returned using such-and-such a | hierarchy. In DynaText(tm), use the TEIgen() property-value function. It generates a TEI-conformant extended pointer to whatever element you specify. It supports three variants of the TEI syntax: Either TPATH or PATH from version P1 of the TEI guidelines (now replaced), or the P3 (current) form. P3 is the default; P1 TPATH is almost exactly like your example (as of course you know, Peter). teigen() is smart enough to use the nearest ancestor with an ID and then work down the tree path from there, so you get as robust a pointer as possible. If you're an end user you can't do a lot with the pointer other than display it unless your publisher gives you a place to store it and use it later. But if you have a document using extended pointers, you can set up links in your stylesheet easily to interpret it, using our tei() function. The temporary snag is that tei() only supports P1 syntax so far. The TEI syntax has also been adopted by some other software, and given the enormous amount of data being encoded with TEI, it is incumbent on implementors to support it (it's also dead simple to do -- my C code for interpreting the P1 syntax is 163 lines including comments). 
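[A minimal sketch of the idea discussed above, for illustration only. It is not DynaText's TEIgen() and it does not emit TEI P1 or P3 pointer syntax. It assumes the document is available as well-formed XML rather than SGML, so that Python's standard xml.etree.ElementTree can parse it; the names canonical_reference and describe, the sample document, and the output format are all hypothetical. It shows the two steps described: anchor on the nearest element that carries an ID, otherwise count same-named siblings on the way up the tree.]

```python
import xml.etree.ElementTree as ET


def canonical_reference(root, target):
    """Return (tag, position) steps locating `target`, innermost first.

    Position counts same-named siblings from 1. The walk stops early at
    the nearest element (the target or an ancestor) that has an `id`
    attribute, reported as (tag, "id=...").
    """
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not None:
        if "id" in node.attrib:
            steps.append((node.tag, "id=" + node.attrib["id"]))
            break
        parent = parent_of.get(node)
        if parent is None:
            steps.append((node.tag, 1))  # the root has no siblings
        else:
            same_named = [c for c in parent if c.tag == node.tag]
            steps.append((node.tag, same_named.index(node) + 1))
        node = parent
    return steps


def describe(steps):
    """Render the steps in the "<p> #14 in <div> #5" style of the request."""
    parts = []
    for tag, pos in steps:
        parts.append(f"<{tag}> #{pos}" if isinstance(pos, int) else f"<{tag} {pos}>")
    return " in ".join(parts)


# Hypothetical sample document: the hit is the second <p> of the second inner <div>.
DOC = (
    '<text><div id="ch3">'
    "<div><p>one</p></div>"
    "<div><p>alpha</p><p>foo here</p></div>"
    "</div></text>"
)
root = ET.fromstring(DOC)
hit = next(p for p in root.iter("p") if p.text and "foo" in p.text)
print(describe(canonical_reference(root, hit)))
```

[Stopping at the nearest ID keeps the reference short and makes it survive edits made elsewhere in the document, which is the robustness argument made above for working down from an ID.]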
See our doc for more details on all the above, but here's a trivial example of a TEI P1 link using the TPATH syntax (the same example can be used across documents merely by naming the destination entity too): \ I leave the corresponding HyTime syntax as an exercise for the reader. Or, there's a Hytimegen() function for y'all to play with in DynaText. -- Steven J. DeRose Sr. System Architect Electronic Book Technologies, inc. Newsgroups: comp.text.sgml Date: 14 Oct 1994 23:25:07 UT From: Earl Hood \ Organization: Engineering, Convex Computer Corporation, Richardson, Tx USA Message-ID: <37n40j$715@imagine.convex.com> References: <37m3ns$j7c@hermes.achilles.net> Subject: Re: DTD2HTML [Erik Ableson] | I recently ran across a msg containing a reference to a utility called | DTD2HTML, designed to simplify the process of converting SGML data into | a format suitable for use with a WWW server. | | If anyone can point me to a site that has this software (or the mfg, if | it's commercial) it would be appreciated. dtd2html does not convert SGML documents into HTML. What it does do is convert a DTD into an HTML document. It allows hypertext navigation of the structure defined by the DTD, and the ability of adding element and attribute descriptions into the HTML document. dtd2html is free, and the latest information on its availability is at \. If you are designing a DTD, or you want to learn about an existing DTD, dtd2html can prove to be useful. --ewh Newsgroups: comp.text.sgml Date: 15 Oct 1994 00:30:37 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ Subject: Re: Storage Models for SGML data [Steven J. DeRose] [Canonical references] | In DynaText(tm), use the TEIgen() property-value function. It | generates a TEI-conformant extended pointer to whatever element you | specify. 
It supports three variants of the TEI syntax: Either TPATH or | PATH from version P1 of the TEI guidelines (now replaced), or the P3 | (current) form. P3 is the default; P1 TPATH is almost exactly like | your example (as of course you know, Peter). teigen() is smart enough | to use the nearest ancestor with an ID and then work down the tree path | from there, so you get as robust a pointer as possible. So if I'm browsing a text and I find a phrase I want to cite in my thesis, I can call TEIgen to provide me with the chapter and verse? | If you're an end user you can't do a lot with the pointer other than | display it unless your publisher gives you a place to store it and use | it later. But if you have a document using extended pointers, you can | set up links in your stylesheet easily to interpret it, using our tei() | function. The temporary snag is that tei() only supports P1 syntax so | far. My guess is that in 99% of the cases, display is all that is needed: oh, maybe snip it out onto the clipboard or paste it into some edit window. P3 would be needed at some stage, of course :-) This is good news, Steve, thanks...how long has this been available in DynaBook? $64,000 question: ultimately, a lot of the texts we have will need to be available thru WWW. We (and some other people) are working on various methods of converting/exporting/subsetting TEI text into HTML (losing content markup, of course, but adding a useful access facility). To make some form of reference back to the original SGML involves links, which are not a problem to construct, but to perform useful work with them, the WWW browser must call a script which will run a search engine and return the results (reformatted into HTML on the fly). Currently I'm using PAT to do this because it works in line-mode. Do you have (would it be possible to make) a line-mode version of DynaBook that could run silently from a script to do this kind of thing? 
Have a look at http://www.ucc.ie/curia/texts/oengus.html to see what I mean. ///Peter -- Peter Flynn, Computer Centre, University College Cork, Ireland | pflynn@curia.ucc.ie | +353 21 276871 x2609 | +353 21 277194 (fax) | Opinions are my own. | "...persuade users that spreading fonts across the page like peanut butter across hot toast is not the route to typographic excellence..." Newsgroups: comp.text.sgml Date: 15 Oct 1994 00:53:30 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ Subject: Re: Correction re: SGML Author for Word [Christopher R. Maden] | I just spoke with Dale Waldt of \, concerning Microsoft's SGML | Author. I was incorrect in my interpretation of his article. | Regarding the debate on auto-numbering, Microsoft tells him that | anything Word can do, SGML Author for Word can do. That is, since Word | can auto-number, SGML Author can as well. In my previous comments, I assumed that. However, Word has no way to distinguish more than 1 type of heading, i.e., one for the normal use and another for dictionary style definitions, so a conforming SGML input file could not be properly processed by Word and Word could not output the equivalent file, if it is necessary to distinguish such tagged elements. Remember that I am not talking about hard coding things in Word. I am assuming full auto numbering and table of contents creation. Newsgroups: comp.text.sgml Date: 15 Oct 1994 06:42:10 UT From: Richard Wal Matzen \ Organization: Oklahoma State University, Computer Science Message-ID: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | Sorry if I implied it exports style info. It does not. Microsoft | believes that's what RTF is for :-) [W. Eliot Kimber] | ...
In other words, the purpose of Word's SGML tools is not really to | allow you to export any arbitrary Word document in some sort of | equivalent SGML form, but to allow you to create in Word documents that | conform to a specific document type. The same is true of Word | Perfect's Intellitag. A more complete solution would be: allow you to create WORD (or WP) documents that conform to a specific document type, export them as SGML document instances, ... AND export the style information in an efficient and useful manner (associated with the DTD or document instances). [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes,... I don't yet understand exactly what this mapping implies. Does it mean that the user can map a style to an element type (or an element in context), so that when the SGML document instance is output, all text in the WORD document that is associated with (attached to) that style is output as #PCDATA within an occurrence of the element? If this approach is used, then you could export the style's formatting information manually (FOSI like, for each such element/element in context). It should also then be feasible to output it automatically, either in an external definition similar to FOSIs or with some other scheme. I understand that there is more complexity to the problem than a simple one style to one element (or to one e-i-c). However, the above approach would be useful, and I can't yet see why it could not coexist with methods for other constructs (such as structural elements that do not have any associated style). [W. Eliot Kimber] | The Microsoft people impressed me with their candor and their | understanding of SGML and the limitations of their product. They | appear to have avoided the mistakes of their predecessors and have not | tried to oversell what they're offering.... I am anxious to see their product. This level of commitment from Microsoft is a significant event.
It is also good to hear that they are sensitive to the problem of overselling SGML tools, which has been a nagging industry problem. The formatting export issue still remains a problem. Yes, there is RTF, but although it's improving, it still lacks a good formal definition, and it will probably always be a moving target. Hopefully, Microsoft will eventually offer a better and more direct solution. Rick Matzen Newsgroups: comp.text.sgml Date: 15 Oct 1994 12:18:16 UT From: Rick Jelliffe \ Message-ID: \ Subject: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML) [Liam R. E. Quin] | Don't rule out Unicode or ISO 10646 because you don't have a 32-bit | computer! [Richard L. Goerwitz] | Again, this has nothing to do with what I was saying - although it's a | perfectly valid point to make here. If we are going to demand that all these foreigners who have mad character sets all adopt a unified character system, let's first adopt EBCDIC instead of ASCII ourselves. That way we will inflict on ourselves the same kind of needless transliterations that we ask of them: it is only fair. But it will make software so much easier to write... -ricko -- Rick Jelliffe email: ricko@allette.com.au Allette Systems phone: 2-262-4777 Sydney, Australia fax: 2-262-4774 Newsgroups: comp.text.sgml Date: 15 Oct 1994 19:18:43 UT From: Betty Harvey \ Organization: Advanced Information Systems Branch, DTMB, CDNSWC Message-ID: <37p9ujINNo53@oasys.dt.navy.mil> References: \ \ <9410122107.AA2519@notes.microstar.com> Subject: Re: Microstar [Chris Biber] | Being able to visualize the structure of your DTD is in itself of great | benefit to document authors and DTD creators. Being able to visually | _create_ the DTD brings measurable productivity increases to the | process of DTD design. | | NEAR & FAR's interaction with a central tag and structure library (CADE | Groupware) creates a complete document analysis and design environment.
| | I would appreciate any feedback from N&F users out there - Case | studies, problems encountered, suggestions are all welcome. Well since you asked |-). I think Near and Far is a great product. I haven't gotten to the point where I can 'read a DTD like a novel' yet so I need all the help I can get. I not only use it to help me navigate and understand the content models, I also use it to modify existing DTD's. It really is a wonderful tool. However, there are a few improvements I would like to see made. Here are three, off the top of my head. 1. When exporting the internal Near and Far model to an ASCII DTD allow the option of keeping the Elements and its Attributes together in the file. 2. The reports function is good but is limited in its current implementation. The printing function is too slow. Allow the reports to be exported into other formats, namely ASCII. I often compare several DTD's or iterations of DTD's and use other software tools, i.e., database or spreadsheet. I had to install an ASCII printer then print to a file (still slow). 3. Have a means to view the entire DTD hierarchy on a single screen or have a smaller window with the entire hierarchy so you can see where you are. Betty -- Betty Harvey \ | David Taylor Model Basin Advanced Information Systems Branch | Carderock Division Code 183 | Naval Surface Warfare Bethesda, Md. 20084-5000 | Center | DTMB,CD,NSWC URL: http://navysgml.dt.navy.mil/betty.html | Newsgroups: comp.text.sgml Date: 15 Oct 1994 21:04:01 UT From: Kent Summers \ Organization: Electronic Book Technologies, Inc.
Message-ID: <37pg41$g1a@spock.ebt.com> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: Re: Why SGML and NOT Acrobat [Joel Finkle] | Personally, I'd like to see a hybrid file format that can serve both | masters: | | 1) A page-turner application that can display the full appearance of | the original paper document (that regulations still require) | | and | | 2) A structured document editor that allows me to search and extract | the portions relevant to me why stop there? sun micro and ebt announced at seybold: co-development of a "universal document viewer" that can handle (browse, search, etc.) a wide variety of document formats --> SGML, HTML, dynatext e-books, answerbook, SDL (CDE's format), PDF, among others, will be supported. Newsgroups: comp.text.sgml Date: 15 Oct 1994 21:45:20 UT From: Erik Naggum \ Organization: Naggum Software, +47 2295 0313 Message-ID: <19941015T214521Z.enag@ifi.uio.no> References: \ Subject: Re: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML) [Rick Jelliffe] | If we are going to demand that all these foreigners who have mad | character sets all adopt a unified character system, lets first adopt | EBCDIC instead of ASCII ourselves. | | That way we will inflict on ourselves the same kind of needless | transliterations that we ask of them: it is only fair. But it will | make software so much easier to write... It won't be as easy to move from ASCII to UCS-2 as you appear to think. Because of hardware programming languages like C and its derivatives, it will require much arduous work before we get there. Had people used software programming languages with real and distinct `character' types, most of the stupidity could have been avoided. Character encoding is non-trivial. Internal and external representations frequently differ in the real world. Codes _always_ have context dependent semantics. 
Character strings are frequently external representations of other objects which make such strings atomic even though they consist of several individual characters. A code is a distinct concept from an integer, even though they can have the same bit representation. The same is obviously true of floating point numbers and pointers (addresses). Some of these problems manifest themselves with unprecedented force in text processing applications. Languages to be used in such environments should make an all-out effort to handle the complexities involved, and should not catch a ride with hardware programming languages and their implementations of the character concept. With a suitable level of abstraction, it no longer matters which encoding is used. The folly of insufficient abstraction from the encodings has made application programs susceptible to all kinds of easily avoidable ills. It matters little whether the encoding is 8 or 16 or 32 bits wide, or whether it is encoded as a multiple of 8-bit or 16-bit units, or with a multitude of character sets in various code spaces. Characters have meaning independent of their encoding, and as long as a program can agree with itself on an internal encoding that it can find useful, communication with the outside world can occur on any premises whatsoever. General understanding and application of these principles will have to precede deployment of character sets. I don't think the current trend in character set encoding and character representation in simple-minded programming languages will make this easier or harder for any particular character set. #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 15 Oct 1994 22:21:35 UT From: David J Birnbaum \ Organization: University of Pittsburgh Message-ID: <37pklf$5ji@usenet.srv.cis.pitt.edu> References: <3762qj$4p6@finnegan.iol.ie> Subject: Re: YASP, does anyone have a MS-DOS copy? 
[Sean Mc Grath] | Either which way, a DOS binary of YASP compiled using the existing OS/2 | hosted compilation process would be very useful. Er ... so would a native OS/2 binary. I have an OS/2 box, but I don't have the requisite compiler. Sigh. If an OS/2 binary appears, I'd be grateful for a pointer to it. Thanks, David -- Professor David J. Birnbaum djbpitt+@pitt.edu The Royal York Apartments, #802 3955 Bigelow Boulevard voice: 1-412-687-4653 Pittsburgh, PA 15213 USA fax: 1-412-624-9714 Newsgroups: comp.text.sgml Date: 16 Oct 1994 15:13:57 UT From: Gavin Nicol \ Organization: Electronic Book Technologies, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data DynaWeb(tm) will initially support 3 methods of sub-document addressing: \ \ \ where "nnnnn" represent an element ID within DynaText. For those who don't know, DynaWeb(tm) allows people to publish DynaText(tm) books on the Internet with little additional effort. DynaWeb(tm) appears as an HTTP compatible server to WWW users, with some extensions to the normal server functionality. Newsgroups: comp.text.sgml Date: 16 Oct 1994 23:25:55 UT From: Erik Naggum \ Organization: Naggum Software, +47 2295 0313 Message-ID: <19941016T232555Z.enag@naggum.no> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: Re: Why SGML and NOT Acrobat [Joel Finkle] | Displaying an SGML doc on the fly is slow, especially when you want to | see something in the middle. Having just a page-turner limits the | content-based retrieval. A cross-breed is what I'd like to see. I do not write page-turners or formatting applications. However, I know about programming and about smarter ways to do things than linear search, which many non-programmers seem to assume for SGML. 
If I were to write a page-turner or formatting application, I would read the entire SGML document once and build an internal representation that is better suited to my needs than a linear representation. (Lisp interpreters do not process long lists of nested parenthetical expressions, they process an internal representation built by the lisp reader function from the external, linear representation -- similarly for SGML.) One easy way to accomplish this internal representation is to use "markers" into the external (linear) representation and then maintain the necessary environmental information in the internal representation. Given this, it should be relatively comfortable to move to a particular point in the SGML document and read data from there, and one would know enough about the history of the parse at this point to do intelligent things quickly. This is not trivial business, but PostScript is not trivial, either. The key is to use languages that enable such processes comfortably. (Comfort and simplicity are not necessarily related.) We observe that PostScript is not comfortable enough for the page-turners, so they created PDF. A not dissimilar effort was started years ago with the SGML-B project, to obtain a binary representation for SGML. For single-purpose languages, it makes sense to standardize an external representation of the processing state just prior to the `showpage', making the `showpage' very quick later. For multi-purpose languages such as SGML, it makes much less sense, primarily because there are no clear points in the processing when it is obviously useful to save the system state. Therefore, SGML systems must of necessity invent their own internal representations and their own optimized external representations of this information. That they have to invent their own format is not an argument for not doing it and instead using linear search through a long character stream to find things. #\ -- Microsoft is not the answer. 
Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 17 Oct 1994 01:34:49 UT From: Peter Kerr \ Organization: School of Music University of Auckland Message-ID: \ References: \ <1994Sep30.024603.29919@midway.uchicago.edu> <1994Oct13.184655.26716@sq.sq.com> <1994Oct14.051527.25155@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | Okay, now try fully cantillated Hebrew, or Quranic Arabic, or (better | yet) Akkadian cuneiform. Or just try to run an import / export / shipping / publishing / etc. business in the few square miles between Portugal - Hungary - Norway. ISO 10646 is at least an attempt to standardise glyph mapping. Now a *language attribute* might permit context sensitive intelligent glyph mapping, and that would be very useful too. But what would keep the customers happy in the meantime is that their existing document writing, reading and transport software should be able to use a standard glyph map (any standard!) and keep it intact, from the finger-keyboard interface thru to the monitor-eyeball interface. Of course I realise that this problem is not confined to HTML / SGML. -- Peter Kerr bodger School of Music chandler University of Auckland neo-Luddite Newsgroups: comp.text.sgml Date: 17 Oct 1994 03:10:07 UT From: Mike Austin \ Message-ID: \ Subject: Variables in HTLM documents? I've got down general HTML and forms, but it _looks_ like there is a way to define variables in a document. All I want to do is define an ftp directory that I use over and over. Thanks! -- cout << "cman@netcom.com" << endl; << I get more junk e-mail than junk mail! >> Newsgroups: comp.text.sgml Date: 17 Oct 1994 06:20:38 UT From: Sami-Jaakko Tikka \ Organization: Helsinki University of Technology, CS lab Message-ID: <37t53mINN1pb@tahma.cs.hut.fi> Keywords: sgml, ventura, zander tagwrite, html, dtd Subject: Info about Ventura 5.0 and Zander Tagwrite? Hello!
First you should know that I am a novice about all things concerning SGML. I am supposed to begin converting a newsletter to HTML format. The newsletter is written with Ventura 5.0 and together with the fact that Ventura comes equipped with an SGML program called Zander Tagwrite leads me to believe that the conversion could be possible. I am thinking that perhaps it is possible to feed the Ventura document together with HTML DTD to the Zander Tagwrite and have it produce HTML for me. Now does anyone know the addresses or other contact info for the authors of Ventura or Zander Tagwrite? I'd like to ask them if the procedure I've described above is possible. Or is anyone of you able to comment on that? Additionally, does anyone know of similar SGML converters for Unix, Windows and Mac platforms? I mean like converters that can read some word processor format and a DTD and produce another format from that. The DTD I'm thinking about would be that of HTML. Finally, am I correct in supposing that DTD is a well defined format and that any converter program claiming to do SGML should be able to read a text file describing the DTD and produce something reasonable? Thank you in advance. -- \Sami Tikka\ "Peace and Long Life." "Live Long and Prosper." Newsgroups: comp.text.frame,comp.text.sgml Date: 17 Oct 1994 09:51:40 UT From: Steve Holden \ Organization: Scotia Electronic Publishing Message-ID: <782387500snz@scotia.co.uk> References: <37ljg6$26l@elna.ethz.ch> Subject: Re: SGML-related Frame Products: no university licensing?! [Hans Ulrich Wiedmer] | I am told by a local Frame distributor that there is no university | licensing available. This holds for every Frame Product related to | SGML (i.e., FrameBuilder, SGML Toolkit, SGML Utilities etc.). It | sounds like this is just the policy of Frame Technology Inc. | | DO OTHER UNIVERSITIES EXPERIENCE THE SAME SITUATION ?????? | | - if yes: is there an understandable reason for Frame Technology to do | so? 
Well, obviously I can't speak for Frame. I ran the first reseller in the UK for about five years. During those five years I spent quite some time encouraging Frame to compete in the UK academic market against a very effective InterGrief promotion where licences were available at nominal cost. | - if no: could anybody give me an address of a Frame distributor in | the "neighbourhood" (whatever that may mean, let's say Europe). Frame steadfastly ignored such advice [their privilege] for about four years. Just after I spent three weeks in intensive negotiations with a major UK university they announced [without having told their resellers first] that a deal had been arranged with a UK body called CHEST, where licenses would be available for a small fraction of the standard price. But resellers were cut out of this deal due to low margins! I understand that dealers in the UK can now sell to academic sites at the special pricing. My advice would be to get your resellers to apply pressure to Frame's European distribution staff based in Dublin, and threaten to go to a UK reseller. Experience shows that Frame's attitude to pan-European dealing was, shall we say, eccentric -- and may not comply with various EU directives! CAVEAT: I am no longer involved in reselling FrameMaker or its derivatives. -- Steve Holden Scotia Electronic Publishing +---------------------------+ Tel: +44 436 678962 29 John Street | Tools, Training and | Fax: +44 436 677814 (bureau) Helensburgh | Technology for Successful | Internet: steveh@scotia.co.uk Scotland | Electronic Publishing | Compuserve: 100343,2205 G84 8XL +---------------------------+ Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:09:44 UT From: Markus Kuhn \ Organization: Student Pool, CSD, University of Erlangen, Germany Message-ID: <37tih8Eemq@uni-erlangen.de> References: \ <19941015T214521Z.enag@ifi.uio.no> Subject: "was: Abandon ASCII for EBCDIC now!"
\ [Erik Naggum] | It won't be as easy to move from ASCII to UCS-2 as you appear to think. | | Because of hardware programming languages like C and its derivatives, | it will require much arduous work before we get there. Had people used | software programming languages with real and distinct `character' | types, most of the stupidity could have been avoided. However, there is a wonderful practical compromise between the ASCII compatibility some systems (especially C/UNIX) require and the need for a 16- or even 31-bit character set: the UTF-8 encoding of ISO 10646. It is simple, easy to remember, ASCII compatible and has a lot of nice properties. For those interested, please have a look at ftp.uni-erlangen.de pub/doc/ISO/charsets/utf-8.c (contains all the documentation and some simple demo code). As SGML is a _very_ text-file (ASCII) oriented system (well, in pure theory it is not, but the whole design of SGML is focused on allowing the use of editors like vi and emacs for SGML file handling), using UTF-8 instead of UCS-2 would be a consistent continuation of this design goal. Markus -- Markus Kuhn, Computer Science student -- University of Erlangen, Internet Mail: \ - Germany WWW Home: \ Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:38:03 UT From: Piero Mancino \ Organization: Ericsson Telecom Message-ID: \ References: <9410140838.AA04551@lars.texcel.no> Subject: Re: SGML and the EEC There was an ODA conference last June in Brussels which two people from the EC attended. They were directly involved with the use of ODA in the EC. Here are their addresses; if you need more information you can e-mail me. I hope this will help, /Piero Ken Thompson (Legislation and Standardization and Telematics Network) Tel.
+ 352 4301 33677 Fax + 352 4301 33869 X.400 C=be; A=rtt; P=cec;OU=di;S=piette Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:53:05 UT From: Peter Witt \ Organization: Ericsson Telecom Message-ID: <37tl2h$jvt@erinews.ericsson.se> References: <36p76b$c1@erinews.ericsson.se> Subject: Re: SGML to Postscript formatter I have been off the net for a week and I see now that I have had a great response on the high speed formatting issue for SGML to Postscript. I have included the articles mailed on this subject with my comments/more information below. [Bob Agnew] | There are two prevalent standards for the formatting of SGML tagged | data, FOSIs and DSSSL. FOSIs are governed by the outspec DTD in | Mil-Std-28001B Appendix B. FOSIs are supported by Arbortext and | DataLogic. DSSSL? [Peter Flynn] | What about SGML-->TeX-->PostScript? I wrote a pilot SGML2TeX which you | can pick up by ftp in pub/sgml on curia.ucc.ie. At least using TeX | overcomes any portability problems. Peter, I will get the program to test it here, but first I have to find out how I can fetch files with mail ftp. I have not tried it before so this will be my first attempt. I have no direct access to Internet from where I work, but our company has a central node on which I need to have an account. What more do I need after I get a TeX output from the SGML2TeX? A TeX to Postscript formatter? How do you specify the formatting data for SGML2TeX? FOSI? [Warren Baird] | The system we're using here consists of a shell script that calls sgmls | and sgmlsasp, and optionally calls a perl script with the output of | sgmlsasp to polish the output. | | We either go directly from SGML to HTML for online viewing, or go from | SGML to LaTeX, and then from LaTeX to PostScript for printing. | | Going directly from SGML to Postscript is feasible, but probably not | practical. Why not? OK ;-), we don't have to go all the way to postscript, but the other machines we are using expect postscript.
For printing of ps on our high volume production XEROX printers we use Fibre 600 software from ENTIRE to create IMG formatted documents that can be printed. | With our system, going from SGML to HTML or LaTeX is quite fast, but | going from LaTeX to PS is as slow (or as fast) as running LaTeX | normally is. [Peter Bomberg] | since you did not mention the products that you have looked at I am not | sure if I'm just giving you old information. | | I have looked closely at using one of four products. CAPS (xerox), | Pager (datalogics), I6sgml (interleaf) or arbortext publisher | (arbortext). They all have pros and cons but if you want any | specific info then just ask :} | | ps. What products have you looked at? (and what did I miss in my | research) ;-) I have looked at the following products:

Product and supplier:                  Formatting data:
---------------------                  ----------------
SGML Translator from IBM               proprietary
Parlance Publisher from XY Systems     proprietary
DL Composer from Datalogics            FOSI
CAPS from XSoft (Xerox)                proprietary (FOSI will soon be implemented, they say)
ArborText ADEPT series from ArborText  FOSI

As you see, no one is talking about DSSSL, and I haven't heard from my contacts at the above suppliers that DSSSL will be considered either. [John Vernon] | Can you give me any contact details for any of these two companies? I | am looking for an SGML to Postscript converter. | | The main concern is ease of use, (e.g., FOSI defined style info is | fine, but something simpler in addition would help), performance and | availability across platforms. | | Can you advise me where I might find such a converter. Newsgroups: comp.text.sgml Date: 17 Oct 1994 12:22:31 UT From: James Clark \ Organization: None, London, England Message-ID: \ Subject: a new SGML parser \ available for testing I have been writing a new SGML parser, tentatively called SP, over the last couple of years.
My main goal was to provide a solid base for the SGML applications that I wanted to write in the future. But I think it now has sufficient functionality that it may be useful to others. So I've decided to make it available for alpha-testing. The code is copyrighted under the same terms as X11R6; these allow commercial use. Here are some of the features of SP: - Written from scratch in C++. - Supports any concrete syntax allowed by the standard. - Supports all varieties of LINK (but the length of chains of LINK processes is limited to 1.) - Reentrant: a single process can use multiple parsers at the same time. - Includes application (nsgmls) that generates sgmls output format and is command-line compatible with sgmls. - Includes application to generate RAST output format. (Converting sgmls output to RAST format isn't sufficient for testing LINK.) - Supports large character sets. It can be compiled to use 16-bit characters internally (32-bits is also a possibility for the future). This allows the entity manager to deal with all the messy details of multi-byte encodings. System identifiers can specify which coding scheme a file uses. Supported coding systems include UTF-8 and Unicode/UCS-2 (intended for use with ISO 10646) and UJIS/EUC and Shift-JIS (for Japanese character sets). - Fast for large documents. For example, on my machine (a sparc 10), it processes the 7Mb Exoterica test document (lcrw.sgm) roughly twice as fast as sgmls. On the other hand, it's slower than sgmls at parsing prologs. It also uses much more memory than sgmls. I haven't looked at its performance on other machines, and it may well be worse. It doesn't support any of the following: - CONCUR (it will parse documents that use CONCUR but you can't make document types active). - DATATAG (support for arbitrary short references makes DATATAG of limited usefulness in my view.) - CAPACITY checking. I would emphasize that this is an alpha release. 
This means: - It hasn't been tested very much (although I have run it on one large SGML test suite). - There isn't much documentation. In particular, there's no documentation on the parser interface. On the other hand, by reading the header files and looking at the included applications, you should be able to figure out how to use it. - I haven't done much release engineering. There's no fancy configure script. You may well encounter difficulties building it if your system differs from mine. - The parser is still under active development. All of the interfaces are subject to change. My development platform is a Sparc 10/Solaris 2.3 with gcc 2.6.0. Earlier versions of gcc will not work. An iostreams library is needed (such as is included with libg++). You should have little difficulty building on SunOS 4.1, since I only recently moved from that to Solaris. I have also compiled it with Sun C++ 4.0, but I wouldn't recommend using this. The code is in theory quite portable, but it uses templates very extensively, and the mechanisms for instantiating templates vary greatly between compilers, and furthermore many template implementations are rather buggy. So I would recommend using gcc if you can. The source code uses long filenames, so you will not be able to build it on MS-DOS/Windows. (If you want to attempt to build an MS-DOS/Windows version I suggest you try building a Win32s version on Windows NT.) You can get it by anonymous ftp from ftp.jclark.com:/pub/sp/sp.tar.gz. Binaries for a sparc running solaris 2.3 are available in the /pub/sp/sparc-sun-solaris2.3 directory. Man pages are included in the main distribution but are also available separately in the pub/sp/man directory for the benefit of those using the binaries. I would be happy to provide space for binaries for other architectures. I would very much like to hear about any bugs you find. I am also interested in hearing about any places where the code does not conform to the current ANSI C++ working paper. 
At this point, I'm less interested in hearing about bugs in compilers that prevent them from compiling SP. Although the code is all new, the parser uses many ideas from Charles Goldfarb's ARCSGML parser and the entity manager uses a number of ideas from Erik Naggum's POEM. I thank them both. James Clark jjc@jclark.com Newsgroups: comp.text.sgml Date: 17 Oct 1994 13:07:26 UT From: Jacques Deseyne \ Organization: EuroKom Conferencing Service Message-ID: <1994Oct17.150726.14369@eurokom.ie> Subject: Re: SGML and the EEC I don't think that the EEC is dropping SGML in favor of ODA. Indeed, a few projects are set up to test ODA for communication of documents, but this doesn't mean that investments in SGML will be given up. On the contrary. The IDA program (Interchange of Documents between Administrations) has one project where conversion of documents in ODIF (ODA Interchange Format) will be tested out. This is for really _BLIND_ exchange without agreements on structure, which is what ODA was invented for. Information on this project should be available from Mr. Guy Rossignol, European Commission, DG III/B/5, Office R3 7/21, 200, Rue de la Loi, b-1049 Brussels, Fax. +32 2 299 02 86. The IDA program is also involved in using SGML! A whole bunch of SGML applications are set up at the EEC and will be extended to other environments of users. New applications will be set up. As an example, the Publications Office is quite involved in using SGML for paper-based and electronic publishing. I don't think ODA fits their needs. The best information point is probably the Open Information Interchange initiative, part of the Impact program. I don't find the contact details right now, but it is in Luxembourg. A few initiatives from the EEC have invested in the development of the ODA standard and continue to promote its use for specific purposes. From the implementation point of view, the number of SGML-based installations far exceeds the installations using ODA. 
We were involved in testing out both SGML and ODA for conversion between different word processing formats at the EEC, in the context of the "EuroLook" documents. EuroLook is a set of structured documents which can be created in customized word processor environments. It appeared from these experiments that current ODA converters lose all structure information (i.e., none takes the Generic Logical Structure, which is more or less the ODA equivalent for a DTD, into account). Only the converter from Bull (RTF to ODIF) performed somewhat better. For the SGML conversions, it appeared that you need high-quality conversion environments (such as OmniMark, FastTag, ...) for good results. Nothing in my contacts with a lot of people at the Commission showed anything which would lead to conclude that the use of SGML would be dropped in favor of ODA. The nature of this rumor seems to be inspired by the age-old hostility which appears to exist between SGML and ODA adepts. Little can be done to appease the animosity, but every side should be aware that both standards were conceived for really different purposes. SGML is probably well known to the readers of this news group; ODA was conceived for the blind exchange of documents in an OSI environment (its acronym meant originally "Office Document Architecture", later changed to "Open Document Architecture"). I would be happy to know the source of these rumors! 
-- Jacques Deseyne jdeseyne@eurokom.ie Avenue Louise 250, box 14 B-1050 Brussels Newsgroups: comp.text.sgml Date: 17 Oct 1994 14:17:55 UT From: Paul Hermans <71140.2011@compuserve.com> Message-ID: <941017141755_71140.2011_HHE36-1@CompuServe.COM> Subject: SGML BeLux -- new address Pro Text, chair of the Belgian-Luxembourgian chapter of the International SGML Users' Group, SGML BeLux vzw, has moved to: Interleuvenlaan 62 3001 Heverlee +32/16/40 66 81 (voice) +32/16/40 66 91 or 40 01 35 (fax) Newsgroups: comp.text.sgml Date: 17 Oct 1994 15:05:18 UT From: Christophe Espert \ Organization: A poorly-installed InterNetNews site Message-ID: <37u3re$72d@edf3.der.edf.fr> Subject: YASP compilation on Unix platforms \

Regarding YASP's compilation on Unix platforms, if you encounter some problems with the GNU GCC compiler, read this.

You may encounter some problems compiling the server.c file necessary for the SMP00 and the RAST applications. You should use g++ with the -x c option. This will ensure the libraries are linked correctly.

Moreover you may have some problems at linking time with the fgetpos and fsetpos functions. You can replace them with ftell and fseek respectively. Scan through the server.c file and find the lines where fgetpos and fsetpos are used, then do this:

Replace

    if ((rc = fgetpos(f_p->fd, &posblk_p->fpos)) != 0) {

with

    if ((posblk_p->fpos = ftell(f_p->fd)) < 0) {
        rc = 0;

and replace

    if ((rc = fsetpos(f_p->fd, &(sra_p->posblk.fpos))) != 0) {

with

    if ((rc = fseek(f_p->fd, sra_p->posblk.fpos, SEEK_SET))) {

Finally add the following statements right after the #if (defined(M_I86)) #endif statements at the top of the file:

    #if (defined(GNUCC))
    #define FILENAME_MAX 512
    #define fpos_t long
    #define SEEK_SET 0
    #endif

I hope it is all clear :-) Good luck.
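For readers who want to try the substitution outside YASP, here is a minimal sketch of the same idea in isolation. In ISO C, fgetpos records the current stream position and fsetpos restores it, so they correspond to ftell and fseek in that order. The names stream_pos, save_pos and restore_pos are invented for this illustration and do not appear in server.c. A plain long offset also cannot carry everything an fpos_t may hold (multibyte shift state, offsets beyond the range of long), so treat this as a workaround for ordinary byte-oriented files only.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for fpos_t where fgetpos/fsetpos are unavailable. */
typedef long stream_pos;

/* Record the current offset of fp in *pos (the job fgetpos would do).
   Returns 0 on success, -1 if ftell reports an error. */
int save_pos(FILE *fp, stream_pos *pos)
{
    long off;

    assert(fp != NULL && pos != NULL);
    off = ftell(fp);
    if (off < 0)
        return -1;
    *pos = off;
    return 0;
}

/* Move fp back to an offset recorded by save_pos (the job fsetpos would do).
   Returns 0 on success, nonzero if fseek reports an error. */
int restore_pos(FILE *fp, const stream_pos *pos)
{
    assert(fp != NULL && pos != NULL);
    return fseek(fp, *pos, SEEK_SET);
}
```

A caller would invoke save_pos before reading ahead and restore_pos to return to the marked spot, checking both return values in the same way the patched server.c lines check rc.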
Best regards, -- Christophe Espert - E-mail: espert@cln46fw.der.edf.fr EDF - DER | High Text 1, Avenue du General de Gaulle | 5, rue d'Alsace 92141 Clamart CEDEX - FRANCE | 75010 Paris - FRANCE Tel: 33.1.47.65.43.21 ext. 6635 | Tel: 33.1.42.05.93.15 Fax: 33.1.47.65.50.07 | Fax: 33.1.42.05.92.48 ISO 8879:1986 - SGML | ISO/IEC 10744:1992 - HyTime | ISO/DIS 10179 DSSSL Newsgroups: comp.text.sgml Date: 17 Oct 1994 15:22:06 UT From: "Wayne L. Wohler" \ Organization: IBM Information Development Strategy and Tools Message-ID: <37u4quINNi50@afshub.boulder.ibm.com> References: <37htslINNogp@afshub.boulder.ibm.com> <19941013T181021Z.enag@ifi.uio.no> Subject: Re: function character parsing [Erik Naggum] The MSxCHAR function characters are, as you say, not markup according to the definition in 4.183 so one cannot argue they should not be recognized under the provisions you cite. While not a conformance argument, these function characters are not useful when they do not work within PIs and comments since authors will certainly want to enter comments in their own language and may need to enter characters from their alphabet in PIs. | 9.7 Markup Suppression | | An /MSOCHAR/ suppresses recognition of markup until an /MSICHAR/ or | entity end occurs. An /MSSCHAR/ does so for the next character in the | same entity (if any). | | NOTE -- An /MSOCHAR/ occurring in |character data| or other delimited | text will therefore suppress recognition of the closing delimiter. An | /MSSCHAR/ could do so if it preceded the delimiter. I would feel better about this citation if there were a definition of delimited text in the standard, which there is not. Given the definition of delimiter characters in 4.86, I think it is fair to say comments and PIs fall into the classification of 'delimited text'. Thanks for the post, Erik. I have received one note offline which agrees that the interpretation you expressed is correct. -- Wayne L.
Wohler Internet: wohler@vnet.ibm.com Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL Information Development Strategy and Tools Phone: +1 303 924 5943 IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191 Newsgroups: comp.text.sgml Date: 17 Oct 1994 16:15:11 UT From: "Wayne L. Wohler" \ Organization: IBM Information Development Strategy and Tools Message-ID: <37u7ufINNi50@afshub.boulder.ibm.com> References: \ Subject: Re: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML) [Rick Jelliffe] | If we are going to demand that all these foreigners who have mad | character sets all adopt a unified character system, let's first adopt | EBCDIC instead of ASCII ourselves. Sounds like a good idea to me. :-) (check the signature line if you don't get the joke). Seriously, this isn't about inflicting needless pain; it's about finding a way out of the morass of character encodings we have today, in both EBCDIC and ASCII, both of which are largely meaningless terms unless you use only lower case, upper case and a few punctuation characters. However we address this problem (and it is a problem that is growing as the world-wide networks are used more), it will take a LONG TIME to move to the solution. We didn't build the current tottering edifice in a day. Prudent application developers will begin to develop their code independent of the encoding of the data it processes, if they aren't already. This will make it easier to concurrently support the function on today's systems as well as systems developed in the future which use more modern character encodings. That way, you can take your pick: EBCDIC, ASCII, ISO 10646 or roll-your-own. SGML systems have a very natural place to put this function: it's called the entity manager. -- Wayne L.
Wohler Internet: wohler@vnet.ibm.com Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL Information Development Strategy and Tools Phone: +1 303 924 5943 IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191 Newsgroups: comp.text.sgml Date: 17 Oct 1994 16:23:53 UT From: "Michael A. Nelson" \ Organization: MSFC Message-ID: \ Subject: SGML Fan want formatting info I've been watching the progress of SGML for several years now and I continue to be surprised by the diversity of its application. I work in Management Information Systems here at Marshall Space Flight Center (MSFC) for NASA and we are involved in a lot of configuration management of information (mainly documents). Our procedures for managing the data products are heavily rooted in the ultimate paper format of the data. My question is -- is there a standard approach to formatting SGML tagged information such that I can always guarantee that everyone working with it sees the same thing? I understand there is something called a Formatting Output Specification Instance (FOSI). Are FOSIs application-unique, or is there some kind of standard, either formal or de facto, that all FOSIs are based upon? With the limited time that I've had to research this I haven't found one. Please post your responses here in this newsgroup. Thanks -- # Michael A. Nelson (MAN) ... on the moon. # # "Everything'll be all right" said the elephant as she danced # among the chickens. Newsgroups: comp.text.sgml Date: 17 Oct 1994 18:32:34 UT From: "Steven J. DeRose" \ Organization: EBT, Inc. 
Message-ID: \ References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: Re: Why SGML and NOT Acrobat [Joen Finkle] | Personally, I'd like to see a hybrid file format that can serve both | masters: | | 1) A page-turner application that can display the full appearance of | the original paper document (that regulations still require) | | and | | 2) A structured document editor that allows me to search and extract | the portions relevant to me I too would like to have the Holy Grail. But this combination is hard. No one has yet proposed a workable approach. I would say that if you have to choose, you're far better off choosing the high ground and attempting to force-fit formatting into an SGML structure, than to try to fit the richer descriptive markup structure of SGML into a format file. One key problem is that a page-turner has to support whatever flashy new layout someone dreams up -- flip through the glossiest magazine you have and look at the layout of the pages. If you want your point (1) you have to support that, and you'd better update your format every time PageMaker(tm) adds a new capability. But the biggest problem is that the relationship between (1) and (2) is incredibly complex. It's not just plopping in comments that contain SGML tags (though that is often suggested). Among the simpler problems: a) How do you express the relationship between a footnote (better yet, a page-spanning footnote) and its reference point? b) How do you represent the connection of a story fragment that got moved miles away because of newspaper page constraints ("see 'foo' on column 3, p. 2b")? c) Speaking of newspapers, why would I *want* to have an 18*24 inch page image on my 8*10 inch screen? Do I clip everything at the screen edges, or do I shrink the 7-point type down to 3-point and hope I can read it on a screen with 80dpi resolution? 
d) What do you do about text and other objects that were generated by the formatting process, or worse yet, that were deleted by it? i) page numbers, headers/footers, automatic list numbers, etc. ii) the 'novice' info when the previous reader was expert, and vice versa. e) What do you do when the font you wanted isn't there? You can either try modifying an existing font to sort of fit, or give up. And if there's a character set difference, you're completely out of luck (which is why page-flipper products store a full copy of any non-Latin1 font in every document file; note that all Mac fonts are non-Latin1!). If you really try to work an example of a substantial document, you'll discover these and scores more problems that must be solved. Pull down a PostScript file from the net and print it out as a PostScript file (not as formatted pages) -- last time I did this I discovered there was not one intact word of the printed text there; it was all programs to place single letters just so. My sympathies in advance to anyone who tries to combine that with hierarchical descriptive markup structures. | Displaying an SGML doc on the fly is slow, especially when you want to | see something in the middle. Having just a page-turner limits the | content-based retrieval. A cross-breed is what I'd like to see. This, I'm happy to say, need not be correct. Try DynaText(tm). Try it on a 100-MB document. Open it and drag the scrollbar to the middle or end. In my SGML course at HyperText '93 I showed the same document in a whole range of formats from bitmap through RTF, troff, and on to SGML. The closer you get to SGML, the fewer bytes you need (even allowing for compression), and, often, the less processing you need for typical applications. That makes for speed. Now, given all that, I will say there is one method that has been shown to work reasonably well: store 2 completely separate representations, and make them point at each other. 
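A minimal sketch of that two-representation idea, in Python. Everything named here is an assumption made for illustration and not what any particular product does: the marker syntax <pb n="...">, the function names, and the image file naming scheme are all hypothetical. It covers only the uncomplicated case, finding which stored page image goes with a given spot in the SGML text.

```python
import bisect
import re

# Hypothetical marker syntax: <pb n="3"> marks where formatted page 3 starts.
PAGE_MARKER = re.compile(r'<pb n="(\d+)">')

def index_pages(sgml_text):
    """Return parallel lists (offsets, pages), one entry per marker, in document order."""
    offsets, pages = [], []
    for match in PAGE_MARKER.finditer(sgml_text):
        offsets.append(match.start())
        pages.append(int(match.group(1)))
    return offsets, pages

def page_for_offset(offsets, pages, offset):
    """Return the page holding the character at `offset`, or None if it precedes every marker."""
    i = bisect.bisect_right(offsets, offset) - 1
    return pages[i] if i >= 0 else None

def image_for_page(page):
    """Hypothetical naming scheme for the separately stored page images."""
    return f"page{page:04d}.tif"

doc = '<pb n="1"><p>First page text.</p><pb n="2"><p>Second page, with a footnote.</p>'
offsets, pages = index_pages(doc)
hit = doc.index("footnote")          # e.g. the position of a search hit in the SGML
page = page_for_offset(offsets, pages, hit)
print(page, image_for_page(page))
```

The SGML side stays the searchable, structured copy; the lookup just hands back the name of the page image to display. Nothing here addresses footnotes, moved story fragments, or generated text.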
For example, plop markers into the SGML that show where each formatted page starts, and keep the formatted pages around as bitmaps or something. While this does *not* solve the problems I noted, it at least gives a simple, clean representation for the *uncomplicated* cases. The TEI has worked hard on such problems, perhaps because it encompasses experts in both ends of the process: paleographers who care about the exact slant angle of the third "t" in each line of manuscript, and linguists who want to do vocabulary statistics on abstracts versus full-text, as well as a broad spectrum of scholars, implementors, and others between. As is becoming well known, excellent thought and practical solutions to serious markup problems are to be found in the TEI guidelines. -- Steven J. DeRose Sr. System Architect Electronic Book Technologies, Inc. Newsgroups: comp.text.sgml Date: 17 Oct 1994 19:33:39 UT From: "Steven J. DeRose" \ Organization: EBT, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ \ Subject: Re: Storage Models for SGML data [Peter Flynn] | So if I'm browsing a text and I find a phrase I want to cite in my | thesis, I can call TEIgen to provide me with the chapter and verse? In effect; what it will actually give you is a P1 extended locator that starts with the nearest ancestor that has an ID, and then walks down by child numbers to the node you selected. For those who know HyTime, this amounts to a location chain consisting of a nameloc and treeloc (except that you have the option of counting children by GI if you want, so that SECTION 2 is #2, not #3 because of the chapter title, etc.). If you want an ugly but trivial way to view these values, copy your fulltext stylesheet and modify one or more style definitions to put the locator value in before the content, maybe in brackets like here: \