News on Quantum Cryptography\

or similar; or (b) that editors etc can activate on-screen section numbering deduced from \

or similar elements, what I mean is support for Canonical Reference. If I am in the midst of an instance, having got there via, say, a search routine which has returned the exact occurrence of "foo" that I was looking for, I now want to know where I am: I want the database system to return not just "`foo' found at offset 123456" but something like "`foo' found in \ #14 in \ #5 in \ #8 in \ #3", according to some rules I specified before the search saying that for all hits I want the structure returned using such-and-such a hierarchy. The TEI proposals (P3) provide an interesting mechanism for doing this in reverse: you can specify \

-something and \

-something else and the hooks are there for software authors to use to allow this kind of navigation. But from my reading of it, it is directed at retrieval where you already know the Canonical Reference in advance, and just want to look it up (eg `find me the text of 1 Kings 21:21') rather than the search-oriented lookup, where you find the word or phrase and want to derive the pointers. If I've misunderstood the TEI's take on this, doubtless Lou or Mike will put me right :-) but I still maintain that the need is there for search engines to be able to return Canonical References. Without it, support for hierarchical references is a one-way street, as you have rightly spotted. | 3) An ability to maintain bi-directional links. (Suppose the | legislature inserts a paragraph, shifting all of the remaining | paragraphs down. How do I keep the pointers into the relocated | paragraphs valid?) This is much easier. Use referential mnemonics, not hard-wired locations. [La]TeX has been doing this for over a decade, HTML for a few years. Now that some of the mechanisms for using hypertextish linking between and within files are becoming better-established, it should be relatively easy to decide on a system to allow database maintainers to insert these links. Keeping the referential integrity is another problem: it's easy to point from parent to increasing numbers of children, but much harder to check back when one of the parents dies. A relational database, as you suggest, would handle the problem, but most of these don't scale, or do so only at much increased cost, both in terms of software and of the hardware needed to support it. My, it's a lovely sunny morning outside...:-) ///Peter Newsgroups: comp.text.sgml Date: 08 Oct 1994 14:49:43 UT From: Peter Flynn \ Organization: CERN European Lab for Particle Physics Message-ID: \ References: \ <36s4r0$p0c@yucca.ossi.com> <1994Oct8.044159.19125@phent.UUCP> Subject: Re: What DTD to use for Man-Pages? 
[Ralph Ferris] | Manual page markup is thoroughly covered in the DocBook DTD, which has | become the de facto standard for software technical documentation. | DocBook was originally developed by HaL Computer Systems and O'Reilly & | Associates (who hold the copyright); anyone can use it freely provided | the copyright notice is included. DocBook's development is under the | control of the Davenport group. I suggest getting on their alias to | keep up with developments. You can get information on how to do so by | logging into ftp.ora.com via anonymous ftp and following the | instructions in ./pub/davenport/toget.docbook

[Kenneth R. Kress] | I retrieved the shar package and the postscript documentation (which | apparently was processed with dvips and TeX), but was disappointed | there was no mention of taking docbook through TeX to some form of | output. Is there a docbook to (La)TeX ASP file as there is with | linuxdoc? Am I missing something?

I'm using Silmaril's SGML2TeX which happily lets you convert arbitrary (but non-minimised, pre-validated) SGML instances (including DocBook) into TeX files, and creates a .sty file which you then fill out with whatever hooks into plain TeX or LaTeX you need (or you can pre-code them in a config file). You can still get the beta version (which doesn't handle DOCTYPE) from ftp://www.ucc.ie/pub/sgml/sgml2tex.zip (PC version only as yet). Gamma release in a few months.

///Peter
Disclaimer: I wrote it.

Newsgroups: comp.text.sgml
Date: 08 Oct 1994 18:08:03 UT
From: Jeffrey McArthur \
Organization: ATLIS Publishing
Message-ID: <378h9d$e12@news.delphi.com>
References: <374l6q$fuv@usenet.rpi.edu>
Subject: Re: Storage Models for SGML data

[Todd G. Hivnor] | Things I'd like to see in the storage system: | | 1) Revision control (when, how, and by whom was the data changed? And, | put it back the way it was)

I have done this. This is relatively easy to do. We maintain a simple relational database of each document.
The database keeps track of when, who, and what the person changed. We also track how long it took them to make the changes. The biggest difficulty with this approach is it makes it very difficult to use "off-the-shelf" tools like Author/Editor. A/E is a wonderful tool if you use their pre-compiled data format. But if you have to import and export data into A/E every time you use it, it "sucks dead bunnies through a straw". This is particularly nasty if you want to edit "medium sized" documents in the range of 10-20 Meg in size. Editing large documents (40 Meg or larger) is almost impossible. If your documents fall in the small category (less than 1 meg) you can probably put up with the hassles of importing and exporting with A/E. Version control has saved us a lot of time and money. Unfortunately mixing version control with most GUI based SGML tools is way too hard. That is why we end up using things like PSGML instead. This works. -- Jeffrey M\\kern-.05em\\raise.5ex\\hbox{\\b c}\\kern-.05emArthur a.k.a. Jeffrey McArthur email: j_mcarthur@bix.com phone: +1 301 210 6655 fax: +1 301 210 4999 home: +1 410 290 6935 The opinions express are mine. They do not reflect the opinions of my employer. My access to the Internet is not paid for by my employer. Newsgroups: comp.text.sgml Date: 08 Oct 1994 18:15:24 UT From: Namhoon Yoo \ Organization: University of Southern California, Los Angeles, CA Message-ID: <376njs$g9u@pollux.usc.edu> Subject: Any SGML user group in Southern California? Hello, I need the contact point of the subject group. Any pointer would be appreciated. Thanks in advance. 
-- Regards - NamHoon YoO [nyoo@usc.edu] Newsgroups: comp.text.sgml Date: 08 Oct 1994 21:33:23 UT From: Steven Lindberg \ Organization: The World Public Access UNIX, Brookline, MA Message-ID: \ Keywords: Job posting Summary: Position at HarperCollins Publishers NYC Subject: Job posting HarperCollins College Publishers is looking for someone to assist with SGML and Quark production support on a consultant or freelance basis. The main task is to assist in-house staff with parsing and validation of SGML files and with the creation of search-and-replace routines that add XPress tags to the parsed files. Depending upon the individual's proclivities, the work might also involve some desktop production. The ideal candidate would have the following qualifications: - previous support or training experience - working knowledge of SGML - hands-on production experience with QuarkXPress and familiarity with XPress tags - in-depth knowledge of traditional typesetting - experience with programming or macro creation - proficiency on both Mac and PC Interested parties should contact Tim Armstrong, ETM Manager, HarperCollins College Publishers, at +1 212 207 7289 or fax a resume to him at +1 212 207 7703. Newsgroups: comp.text.sgml Date: 08 Oct 1994 23:13:55 UT From: Gary Houston \ Organization: Actrix Information Exchange Message-ID: \ References: \ <19941005T224732Z.erik@naggum.no> Subject: Re: multilingual HTML, SGML documents [Erik Naggum] | the connection between the SGML declaration and the characters actually | found in the entities that are read is already significantly weakened | by the fact that the entity manager must convert line endings into | record boundaries. all computer languages that wish to be used outside | of a single (type of) system must cater to the diversity of line | endings, and this SGML does well in principle, but we still have a | vestige of "codes" for the RS and RE "characters" when they really are | events like the Entity end "signal". 
this is very unfortunate, and | leads to a number of other minor problems. Yes, really they only need to say something like "an entity may be divided into lines according to the conventions of the computer system" and then talk about the semantics of empty lines etc., and use a different notation for short references (assuming these can not just be replaced with editing macros...). | the main problem is this: if the entity manager is free to translate a | small set of codes to other codes, and the SGML parser is only | concerned with what it receives from the entity manager, then the | entity manager might as well translate everything into a uniform code | that the SGML parser would be happy with. the SGML parser is only | concerned with the codes that means something for the parsing process, | anyway, and the application gets whatever the SGML parser considers to | be "not markup". it should be obvious that this scheme leaves a lot to | be desired in terms of identifying the codings involved. we note that | ESIS does not contain any provisions for identification of character | sets. again, ESIS is _not_ a good idea outside of conformance testing, | for which it was defined. | | to solve this problem, it is obvious that we have to sever the link | between the SGML declaration and the actual characters in the files (or | whatever). that is, each file (etc) read must identify its character | sets somehow, and the entity manager must then be responsible for | translation of this source character set into whatever the SGML parser | will be happy with. at this stage, it is inconsequential how the | coding at the file level looks, as long as the SGML parser is happy. | the result is, however, that the SGML declaration's character set stuff | is inconsequential, as well. In my ideal case I would use the same character set for every text file. 
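As a rough illustration of the scheme Erik describes, where the entity manager turns whatever coding a file uses into one uniform form before the parser sees it, here is a minimal Python sketch. The function name read_entity and the sample byte strings are hypothetical, Python's own string type merely stands in for the "uniform code", and a real entity manager would do far more than this.

```python
def read_entity(raw, declared_charset):
    """Decode one entity's bytes and normalise its line endings.

    `declared_charset` is whatever identification accompanies the file;
    how that identification is obtained is left open here.
    """
    text = raw.decode(declared_charset)
    # Treat CR LF, bare CR and bare LF alike as a single record boundary.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Two files holding the same text in different codings and line conventions.
latin1_dos = "na\u00efve\r\ntext".encode("latin-1")
utf8_unix = "na\u00efve\ntext".encode("utf-8")

# After the entity manager has done its work the parser cannot tell them apart.
print(read_entity(latin1_dos, "latin-1") == read_entity(utf8_unix, "utf-8"))
```

Under these assumptions the parser only ever receives the decoded text, so the coding used in the file itself stops mattering to it.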
There could at least be a default (as in the locale of the computer system) for a file for which a character set is not explicitly identified. It also seems more elegant if the identification of the character set can be done outside the file itself (imagine an operating system which provided a character set attribute for each file, or a database of some kind). However, I would believe you if you said this was not practical at present. Also, when exchanging files between systems it may be desirable to use character set mechanisms provided by the transport layer and remove any character set declarations from files.

[Sorry if my postings seem out of order: my news feed has random delays of 1 to 4 days or more for incoming articles.]

Newsgroups: comp.periphs.printers,comp.text,comp.text.desktop,comp.text.sgml
Date: 09 Oct 1994 16:42:37 UT
From: Lars Bruzelius \
Organization: Upsala University Computer Center, SE
Message-ID: <3796gf$12f4@caesar.udac.se>
Subject: Correction: 15th Annual Xplor Electronic Document Conference & Exhibit

Oops, I goofed. Dyslexica strikes again. The domain was missing in the site part of the URL. As compensation to those who tried I can offer you a couple of pages on Xplor International; Non-impact Printers; and Maritime History if you follow the links from my home page at: http://pc-78-120.udac.se:8001/

The 15th Annual Xplor Electronic Document Conference and Exhibit will take place November 6-11, 1994, at the Phoenix Civic Plaza, Phoenix, AZ. For more information take a look at the WWW-server at: http://pc-78-120.udac.se:8001/www/Xplor/15th_Annual_Conference/Conference.html

Most of the 64 pp conference announcement has been made available in hypertext format. Please note that this is not an official Xplor WWW-server. For additional information please contact: Xplor International, 24238 Hawthorne Blvd., Torrance, CA 90505-650, USA.
Telephone: 1 800 669 7567; +1 310 373 3633; +31 2979 41 136 (European Office) Telefax: +1 310 375 4340 Internet: info@xplor.com Please say that you found it on the Web! -- Lars Bruzelius Uppsala University Computer Center Box 174, S-751 04 Uppsala, Sweden. Telephone: +46 (0)18 187731 Bitnet: uddri@seudac21 Telefax: +46 (0)18 516600 Internet: Lars_Bruzelius@udac.uu.se Newsgroups: comp.text.sgml Date: 10 Oct 1994 03:45:50 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data [Todd G. Hivnor] | I'm wondering what's going on with storage models for hypertext SGML | data. I'm looking for a way to store large quantities of slowly | evolving SGML files. Specifically, the data in question is government | regulations, but the problem is probably more general. | | 3) An ability to maintain bi-directional links. (Suppose the | legislature inserts a paragraph, shifting all of the remaining | paragraphs down. How do I keep the pointers into the relocated | paragraphs valid?) [Peter Flynn] | This is much easier. Use referential mnemonics, not hard-wired | locations. [La]TeX has been doing this for over a decade, HTML for a | few years. I'm not sure what Peter means by "referential mnemonics". Do you mean ID references as opposed to something like structural position or byte offset within a given file? If so, then this has been part of SGML from the beginning. If, instead, you mean locations based on properties of objects other than their unique names, then HyTime's location address methods do this, specifically the ability to use queries that refer to the values of properties of objects (e.g., I want to find the CommandRef element whose CommandName sub-element exactly contains the value "fubar" and whose revision attribute value is greater than "1.1"). 
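To make the query in that example concrete, here is a small sketch in Python. It is not HyTime syntax and does not use a HyTime engine: the sample instance is invented, it is written as well-formed XML only so that the standard library parser can read it, and comparing revisions as dotted numbers is an assumption about what "greater than" should mean.

```python
import xml.etree.ElementTree as ET

# Invented sample data using the element and attribute names from the example.
SAMPLE = """
<Manual>
  <CommandRef revision="1.0"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef revision="1.2"><CommandName>fubar</CommandName></CommandRef>
  <CommandRef revision="2.0"><CommandName>other</CommandName></CommandRef>
</Manual>
"""


def version_key(text):
    """Turn '1.10' into (1, 10) so that revisions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_command_refs(root, name, newer_than):
    """Locate CommandRef elements by property values rather than by ID or offset."""
    threshold = version_key(newer_than)
    hits = []
    for ref in root.iter("CommandRef"):
        command_name = ref.find("CommandName")
        if command_name is None or (command_name.text or "").strip() != name:
            continue
        if version_key(ref.get("revision", "0")) > threshold:
            hits.append(ref)
    return hits


root = ET.fromstring(SAMPLE.strip())
matches = find_command_refs(root, "fubar", "1.1")
print([m.get("revision") for m in matches])
```

The address here is the query itself, so it keeps pointing at the right element when other elements are inserted ahead of it.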
It is, of course, up to a given document management or retrieval system to provide support for the query-based addressing for which HyTime defines the syntax, but many systems already have this facility and it's merely a matter of mapping the HyTime syntax onto the functionality of a given system. And tools such as TechnoTeacher's MarkMinder/HyMinder product provide the HyTime-specific functions you'd need to do such an integration. However, you can't fix the "link maintenance" problem just by telling people to change their addressing method, because they often either don't have a choice or have compelling reasons for using specific types of addresses. This means that you need a more complete link management system that can track and manage links regardless of the addressing used or the ultimate objects located as anchors. This is not trivial, but people have been and are working on the problem. The difference is that before there was no interchange form that would allow the interchange of link information and linked documents among different management systems, but with HyTime there is. This means that document and link management systems need no longer be solely monolithic, but can potentially communicate at the data level (as opposed to the API or processing level) with other systems. Complementary specifications such as the Shamrock document management API promise some hope of program-level interchange as well. Ideally, I should be able to create interlinked HyTime documents, bring them into document/link manager X, modify them over time, export them (as HyTime documents, of course), import them into document/link manager Y and be able to continue the exact same work with the same level of control and understanding of the links in Y as I had in X without ever changing the source form of the documents or the links. The import and export should be very nearly transparent with no apparent change to the documents in Y from X. 
Systems X and Y may have very different ways of optimizing their internal representations of the links and addresses used with them, but to the world outside of X and Y, the data in the systems appears to be identical (where identity in SGML can only mean "have the same character stream"). -- \

W. Eliot Kimber (kimber@passage.com) Systems Analyst and HyTime Consultant Passage Systems, Inc., 9971 Quail Blvd., Suite 903, Austin TX 78758 +1 512 339 1400 465 Fairchild Dr., Suite 201, Mountain View, CA 94043, +1 415 390 0911 \

Newsgroups: comp.text.sgml Date: 10 Oct 1994 12:52:34 UT From: Fred Fenimore \ Organization: The News & Observer Message-ID: \ References: <372m54$37q@hermes.achilles.net> Subject: Re: word processor to SGML conversion [Peter Bomberg] | I would like to ask if there is a good listing of products in this | area? If so where and if not could anyone that has any info please | email me. I am specificaly interested in WP 5.2 and WP 6.0, Frame and | Word to SGML converters/translators. There is a pretty good jump page on the Web at http://info.cern.ch/hypertext/WWW/Tools/Filters.html Of course all of this info is related to getting from the various standard word processors to HTML so that might not be what you are looking for. Hope this helps, Fred -- --------------------------------------------------- | Fred Fenimore | The News & Observer | | | 215 S. McDowell Street | | fredf@nando.net | Raleigh, NC 27602 | | | (919) 829-4769 | --------------------------------------------------- | I speak only for myself and then only rarely | --------------------------------------------------- Newsgroups: comp.text.sgml Date: 10 Oct 1994 15:17:18 UT From: "Claude L. Bullard" \ Message-ID: <9410101517.AA44549@source.asset.com> References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? I really enjoyed your comments. They have perspective and a healthy sense of humor. I apologize for the length of *this_one*, but I would rather reply to all of the comments in one post. |-) [Tim Bray] | Wrong. Wrong. Wrong. Wrong. The scope of a commercial application | is *never* defined in advance. If its designers have any brains, they | will assume that new and unforeseen demands will be placed upon it, and | new requirements will arise for the use of its data... The scope of | *military* applications is often attempted to be specified in advance. 
| That's why they so often fail and always cost more than is reasonable, | because the scope is guaranteed to be wrong.

I may have overstated that, Tim. However, it is my experience in large design groups that the failure to *scope* the design is the single largest contributor to underscheduled, underbudgeted and underdelivered applications. IMO, this is a fundamental of applying view dimensions to the analysis of complex systems. On the famous other hand, I think that this also has a lot to do with the overall maturity of the design group itself. However, a well-written and thought-through design spec is a requirement for many parts of a project, especially those organized around Work Breakdown Structures, etc. It's hard to bill against good intentions. Oddly enough, the design spec document system was created to keep costs down. Talk about *good intentions*. |-)

Military applications are, in the perceptions of commercial design, overbuilt. But this is a result of the extraordinary environments in which these applications are intended to function. Only the military requires that a PC operate after being submerged for four hours in salt water or that an LED screen be perfectly readable in the midday sun of a desert. Does this sort of thing get overdone? Yes, but it was this sort of overdoing that enabled the Patriot to perform as an ABM when it was only designed to be a SAM. No one foresaw that requirement as far as I know. But, as you point out, "if its designers have any brains, they will assume that new and unforeseen demands will be placed upon it". They do it with a broad specification negotiated with the customer, who must figure both cost and flexibility into the design. A lot of good ideas go to the O-file because of cost.

[Peter Flynn] | Who was it authorised the (doubtless apocryphal) mass program of | stable-building in the last century "because the number of horses | required by the US Army will increase X-fold in the next 100 years" or | some such?
Probably the same ilk who approved that long underground tunnel in Texas for creating a *supercollider* to smash the final subatomic particle before the rest of the world could get ahead in the race to see who will be first to make the most of nothing. It always depends on who is *smashing* what against the brick wall. Often if designers are left to their own devices, nothing from something is what one gets for their dollars. BTW, I am a designer, not management. I didn't want to gray prematurely. |-) [Richard L. Goerwitz] | I would only add that the Web is not, by nature, a traditional, | localized, commercial product. It aspires to much more than | this....Part of this vision is that of a globalized information | resource. If such a resource were confined to documents written in | Latin script, my feeling is that it would not be worthy of the title | "world wide". I agree on both points. But it is the aspiration that I question. Aspirations are human activities, not the actions of artificial systems; so, are these human aspirations realistic? Should the WWW builders be required to create character set support for the language of the tribes of the African interior with the "click" on the vocables? If the tribes become WWW users, one might say "certainly". But if only a few thousand users can read or understand that set, is it worth the effort? I don't have an answer for that. What I do agree with wholeheartedly is that if the WWW is to realize even a small percentage of those aspirations, cheap and easy to implement mechanisms for the character sets must be found and agreed upon. Excluding cultural groups deprives us of too much and forcing them into the Western languages and the consequent mind set impoverishes all. Yet after mulling these issues over the weekend, these are what I think will become requirements for the WWW: 1. 
Mechanisms (legal or automated) for registering different content types (in the SGML buzz, applications a la DTDs) should be a high priority for the IETF. SGML was designed with that in mind and I should like to be able to continue the practice. HTML is fine for most of what is intended for the Web, but if *unanticipated* applications are to be practical and cost-effective, and the global environment is to foster competition as well as evolution, then we must be allowed to use many of the same mechanisms available on the WWW with different data models and processing strategies. If we can all agree to use an efficient subset of the HyTime addressing and location models, these applications can interoperate to that degree which their designers deem necessary. 2. This will be controversial. It may be necessary for the WWW to prune its nodes just as the human brain reassigns segments of the neuronal *wiring* when they are unaccessed for some amount of time. Pruning may be a local community activity where the *local community* is defined by the laws of the country in which the node resides. 3. If one does not already exist, there should be an international mechanism subject to the *local community* whereby one is *licensed* to become a WebSite. This should not be overly restrictive and can be comparable to existing regulations for becoming a licensed ham (shortwave) radio operator, but it can help to prevent abuse of the network and its users. I don't believe such requirements would interfere with the goals of a global information resource. They would reinforce the primacy of international regulation of a global resource for the betterment of all who service and are served by it. IMHO, it has been the ability of the international community to do just this that has enabled us to awaken from the nightmare that I cited in my last post. Perhaps then, we can hold on to both the dream and the bright morning. Thank you for your patience. 
Len Bullard

Newsgroups: comp.text.sgml
Date: 10 Oct 1994 15:28:10 UT
From: Richard Wal Matzen \
Organization: Oklahoma State University, Computer Science
Message-ID: \
Subject: Re: Anyone have any info about WORD's recent SGML release?

Regarding Microsoft(R) SGML Author for Word:

[Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. It covers everything you can do in Word, including | math and tables.

This seems to imply that the formatting info associated with the styles is output to the SGML file. However, in all of the Microsoft press releases I haven't seen any references to style info being output. So my questions are: Does anyone know if/how this product handles the exchange of style (formatting) info? Is any style info output? If so, how complete is it and in what form is it: attributes? similar to Rainbow DTD? Also, same question for WordPerfect's SGML product.

-- Rick Matzen \

Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc
Date: 10 Oct 1994 17:25:43 UT
From: Howard Kaikow \
Organization: MV Communications, Inc.
Message-ID: \
References: \
Subject: Re: Anyone have any info about WORD's recent SGML release?

To clarify an earlier post of mine: my concern is that Word has no way to distinguish certain commonly needed elements in a style. My favorite example is its inability to use dictionary style definitions for a numbered clause and not lose the capability of automagically updating the table of contents and using the Insert Cross Reference facility to refer to the clause's title. Many documents would have a need for three tags: one for a dictionary style definition; another for the usual numbered clause, followed by a separate unnumbered paragraph; the third for a numbered clause without a title.
Word cannot do the first and third without losing the capability of updating the table of contents and using cross references properly. I pointed out these shortcomings to Microsoft in, as I recall, January 1993; perhaps Word 7.0 will have these and other problems fixed. One can only hope.

Newsgroups: comp.text.sgml
Date: 10 Oct 1994 19:03:05 UT
From: Kevin Brennan \
Organization: Honeywell ATS
Message-ID: <37c359$j04@bmw.hwcae.az.Honeywell.COM>
References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> <3762vk$irc@deep.rsoft.bc.ca>
Subject: Re: SGML "Viewers"

[Kevin Brennan] | We're particularly interested in Air Transport Association (ATA) | SGML compliant viewers with Temporary Revision capability.

[Tim Bray] | What's Temporary Revision capability in a viewer? Inquiring minds | want to know.

Temporary Revision capability allows the using organization to provide revisions to the text contained on the CDROM without cutting a new CDROM. For example, an airline using maintenance manuals on CDROM may receive revisions to the manual *between* issuances of new CDROMs. Jouve's software puts a revision file on the local hard drive and replaces or annotates text appropriately. In the case of a networked resource, there can be a shared file on the network. There is, of course, a separate TR software package (optional at some cost) that the Maintenance Publication department would have to acquire.

K :-)

Newsgroups: comp.text.sgml
Date: 10 Oct 1994 20:57:10 UT
From: Cheryl Conway \
Message-ID: <9409117818.AA781866386@njcorp.akbs.com>
Subject: Change of employer & contact info for Cheryl (Ventura) Conwa

I am posting this by way of introducing myself to comp.text.sgml... and also in the hope that my former colleagues will see this and update their information for me, so I once again get onto all of the SGML-related mailing lists, etc.
I am Cheryl (Ventura) Conway and my work, for the past 10 years has been in: - on-line (formerly aircraft) maintenance manuals, - SGML and electronic document processing systems, - conversion of paper - or "legacy data" - document libraries to on-line, SGML-encoded, hypertext form, - and integrating them with AI diagnostic systems or data acquisition systems (for device-under-test) My FORMER position, employer, and address, phone, etc. were: Cheryl Ventura Conway, Staff Engineer AlliedSignal Guidance & Control Systems Division mailstop G/11 or M/11 or F/12, Teterboro, NJ 07608 +1 201 393 3842 I have recently changed jobs to a better position. My NEW information is: Cheryl Conway, Consultant (no longer using maiden name) Advantage KBS 1 Ethel Road, Suite 106B Edison, NJ 08817 cconway@akbs.com, Internet +1 908 287 6494 x 253, phone +1 908 287 3193, FAX I am continuing to work in the same areas, only now applied to maintaining communications (e.g., telephone) devices and services, both hardware and software, rather than (applied to) primarily hardware aircraft maintenance. Advantage KBS markets a particularly effective on-line help tool based on an integration of SGML-encoded on-line hypertext manuals, integrated with an AI diagnostic engine. If you've previously had contact information for me, or had me on your mailing list -- please do me the favor of updating your information for me so I can get myself re-connected to the SGML world! Regards, Cheryl Conway cconway@akbs.com Newsgroups: comp.text.sgml Date: 10 Oct 1994 21:34:54 UT From: "John Vernon" \ Organization: METASOFT Message-ID: \ References: <36p76b$c1@erinews.ericsson.se> <1994Oct5.143731.20589@ast.saic.com> Subject: Re: SGML to Postscript formatter [Bob Agnew] | FOSIs are supported by Arbortext and DataLogic. DSSSL? Can you give me any contact details for any of these two companies? I am looking for an SGML to Postscript converter. 
The main concerns are ease of use (e.g., FOSI-defined style info is fine, but something simpler in addition would help), performance, and availability across platforms. Can you advise me where I might find such a converter?

Much obliged,
John Vernon

Newsgroups: comp.text.sgml
Date: 10 Oct 1994 22:35:06 UT
From: BlueSky \
Message-ID: \
Subject: SGML parsers for UNIX/public domain ??

Hi, I am interested in finding out whether there are any public domain parsers that run on Unix and check that a text instance complies with its DTD. Any information will be greatly appreciated.

Thank you,
RAVI
bluesky@netcom.com

--
BMW 325i: I love my Car
bluesky@netcom.com
+1 415 634 4002
FLY OUR FRIENDLY SKIES

Newsgroups: comp.text.sgml
Date: 10 Oct 1994 23:06:31 UT
From: Peter Flynn \
Organization: University College Cork
Message-ID: \
References: <374l6q$fuv@usenet.rpi.edu> \ \
Subject: Re: Storage Models for SGML data

[Peter Flynn] | This is much easier. Use referential mnemonics, not hard-wired | locations. [La]TeX has been doing this for over a decade, HTML for a | few years.

[W. Eliot Kimber] | I'm not sure what Peter means by "referential mnemonics". Do you mean | ID references as opposed to something like structural position or byte | offset within a given file? If so, then this has been part of SGML | from the beginning.
If, instead, you mean locations based on | properties of objects other than their unique names, then HyTime's | location address methods do this, specifically the ability to use | queries that refer to the values of properties of objects (e.g., I want | to find the CommandRef element whose CommandName subelement exactly | contains the value "fubar" and whose revision attribute value is | greater than "1.1"). I mean the second: sorry, I was typing fast :-) A better-known usage might be LaTeX's way of using user-made labels as pointers, although these lack the richness of HyTime's location addressing. I'm still looking for systems which can replicate the canonical references for me tho... ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 10 Oct 1994 23:15:10 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | My concern is that Word has no way to distinguish certain commonly | needed elements in a style. My favorite example is its inability to | use dictionary style definitions for a numbered clause and not lose the | capability of automagically updating the table of contents and using | the Insert Cross Reference facility to refer to the clause's title. | | Many documents would have a need for three tags, one for dictionary | style definition; another for the usual numbered clause, followed by a | separate unnumbered paragraph; the third for a numbered clause without | a title. 
Word cannot do the first and third, without losing the | capability of updating the table of contents and using cross references | properly. I cannot conceive of trying to do this with a visual-structure system like Word: if printed output is the principal goal, a logical-structure system like LaTeX would be the natural choice. Why make life difficult? ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 11 Oct 1994 01:40:06 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Peter Flynn] | I cannot conceive of trying to do this with a visual-structure system | like Word: if printed output is the principal goal, a logical-structure | system like LaTeX would be the natural choice. Why make life | difficult? Well, life happens to be difficult. I am forced to use Word to deal with many people that I have to send documents to or to share editing tasks with. Word is not the right tool to use, but it has become a de facto standard in many areas. Personally, I would much rather do everything in SGML, perhaps the environment I am in will be there some day, not yet tho. Newsgroups: comp.text.sgml Date: 11 Oct 1994 06:14:03 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. 
It covers everything you can do in Word, including | math and tables. [Richard Wal Matzen] | This seems to imply that the formatting info associated with the styles | is output to the SGML file. However, in all of the Microsoft press | releases I havent seen any references to style info being output. So | my questions are: : | Does anyone know if/how this product handles the exchange of style | (formatting) info? Is any style info output? If so, how complete is | it and in what form is it: attributes? Similar to Rainbow DTD? The above inference is incorrect. The quoted quote makes it sound like the mapping from styles to SGML is *automatically* defined, but this is not the case. Someone (you or some specialist) must define the mapping from a given set of styles to a particular SGML document type. As a rule, the style definition for an SGML document is separate from the document itself (e.g., FOSIs, DynaText style sheets, etc.) so it is generally not meaningful to talk about style exchange in this context (e.g., going from creating a structured document in Word that is exported as SGML). In other words, the purpose of Word's SGML tools is not really to allow you to export any arbitrary Word document in some sort of equivalent SGML form, but to allow you to create in Word documents that conform to a specific document type. The same is true of Word Perfect's Intellitag. However, there's nothing that prevents you from defining a Rainbow-type DTD that has the elements and/or attributes needed to capture the style information that might be in Word documents, such as paragraph-level format changes. However, it would be up to you to define both the DTD and the mappings to it from Word -- Microsoft cannot provide that out of the box because every case will be different. I saw a demo of this product at Seybold and it seemed to have some interesting features. 
I am hesitant to characterize it, but from what I saw it seemed to fall somewhere between Intellitag and FrameBuilder in terms of support for structured authoring and its import and export facilities. The Microsoft representatives were careful to characterize it as a first step in a continuum of structure- and SGML-based authoring tools. They have, for example, partnered with SoftQuad to provide a true SGML editor (a stripped-down version of Author/Editor) for doing post-export cleanup. The Microsoft people impressed me with their candor and their understanding of SGML and the limitations of their product. They appear to have avoided the mistakes of their predecessors and have not tried to oversell what they're offering. From what I've seen, it appears to be a tool that can fit into a larger SGML-based strategy that includes a variety of tools, SGML and non-SGML, appropriate to specific users and tasks.

Newsgroups: comp.text.sgml Date: 11 Oct 1994 06:32:58 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ Subject: Re: Storage Models for SGML data [Peter Flynn] | I mean the second: sorry, I was typing fast :-) A better-known usage | might be LaTeX's way of using user-made labels as pointers, although | these lack the richness of HyTime's location addressing. I'm not a TeX user, so I'm not familiar with its addressing features, but I'm wondering how a "user-made" label differs from an SGML unique identifier. | I'm still looking for systems which can replicate the canonical | references for me tho... If I understood your requirement, you're specifying a query along the lines of: "find the 4th paragraph that is a descendant of the 2nd Section within the 3rd Chapter." If that's the case, then you can certainly express it as a HyQ query, as well as in a number of other document querying systems, such as OpenText's PAT and AIS' SGML Search. You could even define your own syntax as a new "query notation" and use it within a HyTime addressing context, e.g.: \ \ \ \ ]> \ ... \Check out this paragraph\ \\Para4 in Sect2 in Chap3\ ... \ You could define the DTD to give you more tag minimization to reduce the keystrokes somewhat. For example, you could create a specialized named location element type that can only contain a CanonicalRef element, which would allow the following minimization: \Para4 in Sect2 in Chap3\ Which might be more intuitive for some authors. It's important to remember that any form of addressing can be integrated within a HyTime framework using either name queries or notation-specific locations (which are a form of query), which means that in essence your addressing functions are limited only by the power of your retrieval system, not by some inherent limitation in SGML or HyTime. 
Of course this example of using a specialized query notation has some problems with interchange if support for this form of query is not widespread, but that problem will always exist because no single solution can solve all problems. In this example, I assume that the need for this form of reference is more compelling than a need to enable query interchange.
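The two directions discussed in this thread (looking up a known reference such as "Para4 in Sect2 in Chap3", and deriving that reference from a search hit) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's standard xml.etree.ElementTree on a small XML stand-in rather than a real SGML parser, it is not HyQ or HyTime syntax, and the element names (book, chapter, section, para), the NAMES table, and the functions resolve and canonical_ref are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical mapping from reference prefixes to element type names.
NAMES = {"Chap": "chapter", "Sect": "section", "Para": "para"}

# Small stand-in document (XML here; a real system would parse SGML).
DOC = """<book>
<chapter><section><para>a</para></section></chapter>
<chapter><section><para>b</para></section></chapter>
<chapter>
  <section><para>c</para></section>
  <section><para>d1</para><para>d2</para><para>d3</para><para>foo</para></section>
</chapter>
</book>"""


def descendants(scope, tag):
    """Descendants of scope with the given tag, in document order."""
    return [e for e in scope.iter(tag) if e is not scope]


def resolve(root, ref):
    """Forward lookup: 'Para4 in Sect2 in Chap3' -> element."""
    node = root
    # The outermost container is written last, so walk the steps in reverse.
    for step in reversed(ref.split(" in ")):
        m = re.fullmatch(r"([A-Za-z]+)(\d+)", step.strip())
        if m is None or m.group(1) not in NAMES:
            raise ValueError(f"bad reference step: {step!r}")
        matches = descendants(node, NAMES[m.group(1)])
        index = int(m.group(2))
        if not 1 <= index <= len(matches):
            raise LookupError(f"no such element: {step!r}")
        node = matches[index - 1]
    return node


def canonical_ref(root, target):
    """Reverse lookup: element (e.g. a search hit) -> reference string."""
    parents = {child: parent for parent in root.iter() for child in parent}
    prefixes = {tag: prefix for prefix, tag in NAMES.items()}
    # Collect the target and its ancestors that have a reference prefix.
    chain = []
    node = target
    while node is not None:
        if node.tag in prefixes:
            chain.append(node)
        node = parents.get(node)
    parts = []
    for i, el in enumerate(chain):
        # Count each element within its nearest named ancestor (or the root).
        scope = chain[i + 1] if i + 1 < len(chain) else root
        position = descendants(scope, el.tag).index(el) + 1
        parts.append(f"{prefixes[el.tag]}{position}")
    return " in ".join(parts)


root = ET.fromstring(DOC)
hit = resolve(root, "Para4 in Sect2 in Chap3")
print(hit.text)
print(canonical_ref(root, hit))
```

The reverse function is the part Peter Flynn asks for: given the element containing a hit, it walks up the tree and counts positions, so a search routine could report a structural reference instead of a byte offset. Counting "the Nth descendant" rather than "the Nth child" follows the wording of the query in the post above; a real application would have to fix that rule in advance.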

Newsgroups: comp.text.sgml,comp.os.ms-windows.apps.word-proc,bit.mailserv.word-pc Date: 11 Oct 1994 08:52:02 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: \ \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Well, life happens to be difficult. I am forced to use Word to deal | with many people that I have to send documents to or to share editing | tasks with. Word is not the right tool to use, but it has become a de | facto standard in many areas. Sure, I wasn't attaching any blame to you, just continuing bafflement at how these decisions get made in the light of all the evidence :-) | Personally, I would much rather do everything in SGML, perhaps the | environment I am in will be there some day, not yet tho. I dunno, there are a lot of jobs more suited to Word in an office environment right now, while SGML software remains relatively underused. ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml Date: 11 Oct 1994 13:09:00 UT From: "Bart Souren" \ Organization: Tilburg University / The Netherlands Message-ID: <19941011140936.B.Souren@sp0294.kub.nl> Subject: Looking for SGML/HTML final project. I am a student at the Faculty of Arts of the Tilburg University in Holland. My specialisation area is "Text and Computers", which among other things deals with SGML and HTML. I am looking for a SGML/HTML project for my final paper. From november till february/march I will be busy with a teaching practice at Wolters Kluwer Publishers, which will deal with SGML. After my teaching practice I will have to do a final project, which I prefer to do abroad. Especially I prefer european countries and Australia. 
From my department at Tilburg University I will have to do my final project at a university or scientific institute. So if you know a university or scientific institute where it would be possible to do a final project, please let me know. Keep in touch, -- Bart Souren Bisschop Zwijsenstraat 26 5038 VB Tilburg The Netherlands Newsgroups: comp.text.sgml Date: 11 Oct 1994 13:31:07 UT From: Charles Hurley \ Organization: PANIX Public Access Internet and Unix, NYC Message-ID: <37e42r$mop@panix.com> References: <3762qj$4p6@finnegan.iol.ie> Subject: Re: YASP, does anyone have a MS-DOS copy? [Sean Mc Grath] | I have looked through the compilation scripts etc. for YASP but I do | not have access to an OS/2 box to compile up a DOS version as required. | I guess with a bit of work the REXX script could be analysed to produce | a DOS makefile for Microsoft or Borland and the code compiled directly | on a DOS based PC. If someone has already been done, I would | appreciate any information. If it has not been done, we will do it and | post details here. | Either which way, a DOS binary of YASP compiled using the existing OS/2 | hosted compilation process would be very useful. If you are willing to write the code, I will compile it on Borland C++ or DJGPP. I don't know if I can be any greater help than that. -- % Charles R. Hurley Phone: +1 212 633 3759 % Publishing Technologies Coordinator (TeX Projects) Fax: +1 212 633 3685 % Elsevier Science Inc. E-mail: churley@panix.com % 655 Avenue of the Americas, New York, N.Y. 10010-5107 Newsgroups: comp.text.sgml Date: 11 Oct 1994 21:13:29 UT From: Baohuan Zhang \ Organization: Computer Science, Liverpool University Message-ID: \ Keywords: SGML DTD Subject: How to make a SGML DTD document? Hi, I need to make a SGML DTD document. I would appreciate if anybody can give me some examples of DTD. 
Many Thanks, Baohuan Newsgroups: comp.text.sgml Date: 11 Oct 1994 22:58:07 UT From: Darin Duvall \ Message-ID: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> Subject: Accessibility and access to the Internet Fellow information providers: I've been pretty good about reading all of my newsgroups lately, and I haven't read anything that would suggest that adopting PDFs as a standard would have any sort of harmful effect (see messages below). Did I miss that part? Adobe's PDFs are an OPEN FILE FORMAT. Anyone can write an application to read PDFs. Anyone can write a plug-in that reads PDFs into their existing application. A PDF could be routed to a speech synthesizer or a braille printer just as easily as any other text file, as long as you have a filter for it. [Yuri Rubinsky] | The idea that the page image -- one-time, non-reusable, non-adaptable, | etc -- should be considered a valuable archival or display form is | frightening. Sounds like Yuri should try out Acrobat before jumping to any conclusions. None of the statements above apply to PDF documents. You can copy the text and/or graphics from a PDF and paste them into any Mac or Windows application. You could even write an application that lets you modify a PDF. A PDF is not just a page image. It's the original text and graphics stored along with the formatting information that makes it look consistent across platforms/ configurations. [Mike Paciello] | ...if ADOBE has their way. Once again, may (sic) folks will be left | without true access to information, particularly on the NII. What is Mike talking about? Acrobat's technology is about *increasing* access. For the first time ever, the average person can distribute nice looking documents that incorporate hotlinks and graphics, and different views. The idea that I may one day be able to access all Federal documents in PDF form is WONDERFUL! An indexed document that lets me click on a topic an jump right to that info... 
Our only current alternative is SGML, and there are NO tools that let ordinary people create meaningful SGML documents. SGML certainly has its place, but not as the defacto standard for document exchange. SGML is too complicated and limited for such broad use. Let's not let a few SGML die-hards spoil an opportunity for the rest of us. Mike, the Acrobat Readers are all FREE! I downloaded the Mac, DOS, and Windows versions today! They are on the Net and every major on-line service. How is that limiting access? Mosaic documents can call up PDF files. A PDF can have a link to a Mosaic file. How is that limiting access? If there is an opportunity for Adobe to make some money while we citizens get some really cool information technology, I'm all for it. If there are legitimate concerns about information access, I'm all ears. You can convince me to change my viewpoint with facts, but so far I'm seeing a whole lot of hollering and hype, and I don't see the fire... --------------------------------------------------------------------------- [Mike Paciello] Hi Folks! Please distribute the attached mail on a many listservs as possible. This is a MAJOR concern if ADOBE has their way. Once again, may folks will be left without true access to information, particularly on the NII. ICADD is working hard to block this, as well as the SGML Consortium. We hope to get your support. Please note that I will be at the Mosaic '94 conference next week and at the Internet '94 Conference in December, to ensure that we have an equal voice in accessibilty on the NII. Thanks for your support, Regards, Mike Paciello Program Manager Vision Impaired Information Services Digital Equipment Corporation [Yuri Rubinsky] I understand that the US National Institute of Standards and Technology, supported by dollars from the Dept of Defense and Adobe has done a study suggesting that Adobe's proprietary PDF format be made a US Federal Information Processing Standard for electronic viewing. 
My feeling is that this would throw back the cause of accessible documents immeasurably. The idea that the page image -- one-time, non-reusable, non-adaptable, etc -- should be considered a valuable archival or display form is frightening. I have heard all this just today, and very third hand, but I understand that Oct 17th is the last day for opposing viewpoints to be heard. I encourage those of you who are American, and taxpayers, to make your feelings known. The SGML Open Consortium will certainly do what it can in this area. If you have feelings in this area, or want to know what's going on, contact Mike Rubenfeld at NIST, +1 301 975 3064. Please keep the rest of this list informed too. Yuri Rubinsky --------------------------------------------------------------------------- Sincerely, -- Darin Duvall, Electronic Publishing Specialist University of Illinois _______________________ College of Agriculture | voice: +1 217 244 2832 Information Services | fax: +1 217 244 7503 47 Mumford Hall | 1301 W. Gregory Drive | e-mail: duvall@uiuc.edu Urbana, IL 61801 | Newsgroups: comp.text.sgml Date: 11 Oct 1994 23:01:00 UT From: Peter Kerr \ Organization: School of Music University of Auckland Message-ID: \ References: \ <1994Sep26.233841.22334@midway.uchicago.edu> \ <1994Oct6.142331.11581@midway.uchicago.edu> <9410071859.AA49368@source.asset.com> <9410101517.AA44549@source.asset.com> Subject: Re: Multilingual HTML, SGML documents? Whatever happens to the Web we need guys like Claude: Article <9410101517.AA44549@source.asset.com>, "Claude L. Bullard" \ out there batting for common sense and human decency! 
-- Peter Kerr bodger School of Music chandler University of Auckland neo-Luddite Newsgroups: comp.text.sgml Date: 12 Oct 1994 10:33:07 UT From: Arthur Ogawa \ Organization: TeX Consultants Message-ID: \ References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> Subject: Re: Accessibility and access to the Internet [Darin Duvall] | I've been pretty good about reading all of my newsgroups lately, and I | haven't read anything that would suggest that adopting PDFs as a | standard would have any sort of harmful effect (see messages below). | Did I miss that part? First, many thanks to Darin for posting this note; I hadn't seen anything else to date. I intend to contact the NIST with my concerns, and I hope others in this newsgroup do likewise, regardless of which side of the question you may fall on. The crucial issue here, as I see it, is the nature of the document. Adobe's PDF (of which I am an enthusiastic supporter) is a fine *page description language*, which is not to say that it qualifies as a "primary document". Rather, it is a file format for a derived document, which can be viewed, printed, browsed, and possibly even handed over to a system that allows it to be "read out" for, say, a print-impaired person. The ideal document management system would be one in which a primary document could be served up to a subscriber in whatever form is most convenient for that person. However, PDF does not fulfill the criteria for a primary document. I rather picture the PDF being generated on the fly from an SGML document. That said, it is possible that Adobe will someday achieve something in PDF that will make it qualify as a primary document. In my mind that quality is that one can recover the original, validatable SGML document from the PDF document via some sort of filter. And I am encouraged to look for that outcome, because PDF has been supported and further developed by Adobe since its introduction 2 years ago. 
It is now possible (in the revised Acrobat 2.0) to do text searches in a document with considerably more power than before. And it has always been possible to prepare the PDF document with structured comments which could in principle represent the SGML markup in the original. So in some limited sense, PDF is usable as a primary document, but only to the extent that it fully supports this "SGML transparency". So, what I am saying is that the technical issue around PDF is whether it is capable of the expressive power needed to allow this recovery of the original document. And sad to say, presently it does not provide full capabilities. I have heard, by the way, of the possibility of some Avalanche-provided SGML capability for PDF. I have not heard enough about it to get very enthusiastic, though. | Adobe's PDFs are an OPEN FILE FORMAT. Anyone can write an application | to read PDFs. Anyone can write a plug-in that reads PDFs into their | existing application. A PDF could be routed to a speech synthesizer or | a braille printer just as easily as any other text file, as long as you | have a filter for it. Sure PDF is great, but I think what many of us are concerned about is that it is insufficient to present a formatted document to a print-impaired reader. Those who have worked in the field are striving for a way to deliver much richer information to such a person. In effect she will be able to browse the document structurally if she so chooses. We who can easily read print do take a lot for granted. We can describe the structure of a document just by glancing at its visual presentation. Such is the miracle of the typesetter's art and our own visual cortex. But try working with a document that is poorly formatted, or in which the structure is obfuscated in some way or other -- and you'll get a taste of what a task it is for a print-impaired reader to process any document. 
[Yuri Rubinsky] | The idea that the page image -- one-time, non-reusable, non-adaptable, | etc -- should be considered a valuable archival or display form is | frightening. [Darin Duvall] | Sounds like Yuri should try out Acrobat before jumping to any | conclusions. None of the statements above apply to PDF documents. You | can copy the text and/or graphics from a PDF and paste them into any | Mac or Windows application. You could even write an application that | lets you modify a PDF. A PDF is not just a page image. It's the | original text and graphics stored along with the formatting information | that makes it look consistent across platforms/ configurations. Yuri's point is tersely stated, and perhaps not compelling to some. (And I'll wager that he *has* tried Acrobat out.) Yes, you can mung up an Acrobat document. Yes, it is accessible (being mostly ASCII). But I don't agree that Acrobat is more than a page image (well...page *images* ;-). A bitmap it's not, true. But inside the PDF, you'll look in vain for a statement analogous to "beginning of chapter" or "book title starts here" or the like. After examining the PDF standard, I conclude that inferring structure out of a PDF is presently no easier than doing so out of, say, a Microsoft Word document. And that failing is the big problem. [Mike Paciello] | ...if ADOBE has their way. Once again, may (sic) folks will be left | without true access to information, particularly on the NII. [Darin Duvall] | What is Mike talking about? Acrobat's technology is about *increasing* | access. Granted, Adobe intends to increase access -- to sighted readers. It's the others that one is concerned with. I think that those in this newsgroup can appreciate the gap between Adobe's current technology and what's necessary for high-utility access to a document. | For the first time ever, the average person can distribute nice looking | documents that incorporate hotlinks and graphics, and different views. 
| The idea that I may one day be able to access all Federal documents in | PDF form is WONDERFUL! An indexed document that lets me click on a | topic an jump right to that info... I too am turned on by the concept of widespread access to Federal documents. But I don't see Acrobat as an adequate form for representing that document in its primary form. I think that if someone wants to view such a document in the form of a PDF, then that person should be able to do so. But let's keep the richness of the document's structure in the primary document, so it can be utilized by those capable of doing so. I would favor the original document in SGML form, with PDF delivery an option. | Our only current alternative is SGML, and there are NO tools that let | ordinary people create meaningful SGML documents. SGML certainly has | its place, but not as the defacto standard for document exchange. SGML | is too complicated and limited for such broad use. Let's not let a few | SGML die-hards spoil an opportunity for the rest of us. I don't quite see things this way. First, SGML is not the only alternative to PDF. There are plenty of further alternatives, and none measures up to the standard of information richness that SGML provides. Second, the issue in certain areas might be whether ordinary people can create meaningful documents, but here we are talking about the NIST and the Federal government. It is they who will be creating these documents. So as far as ordinary people go (and lots of the unsighted are ordinary people, you must admit), it's a question of information delivery. And for some of us, PDF won't work well. Third, document exchange is not really an issue here, either. It is document *delivery* that Darin seems to be addressing. If a government document is stored in SGML form, this does not in and of itself limit access to that document. When a request for that document is received, one option may be to deliver it in its archival form, SGML, another option may be PDF. 
Too complicated? Maybe to someone who has to read the SGML document with an ASCII text editor. But I don't see this (time-honored, archaic) way as the way of the future. Limited? The only practical limitation of SGML on the delivery side (that is, once the SGML document exists) is availability of a dirt-cheap browser for folks who don't like to pay real money for the software they use. Back when Acrobat was first announced, Adobe took a lot of flak for having the temerity to suggest that each reader might want to pay $25 for a Viewer. Now that the price has gone to $0.00 (plus the cost of net.access ;-), Acrobat receives Darin's enthusiastic vote. Heh. (The thing that will tip the scales of the public mind in favor of SGML might be something that only has a Name For Now, Yuri.) | Mike, the Acrobat Readers are all FREE! I downloaded the Mac, DOS, and | Windows versions today! They are on the Net and every major on-line | service. How is that limiting access? I love to surf, too, but I am aware that net.access is by no means the common way of getting information. NIST may provide potentially universal access by providing documents online, but let's not kid ourselves that this amounts to anything at all like *universal* access in fact. Meanwhile, if they do choose PDF as the archival form of the document, and turn their backs on SGML, they effectively cut out a large segment of the population (who fall in a category called "the disabled"). This is the thing I don't want. I and others have been working for quite some time to increase access to information to people in precisely such a way that there are *fewer* inherent barriers. SGML per se does not erect barriers. It *can* eliminate some. By the way, Darin, you picked up Mac, DOS, and Win versions, right? Did you not feel a twinge of sympathy for the poor souls who are confined to a Unix box? Huh? | Mosaic documents can call up PDF files. A PDF can have a link to a | Mosaic file. How is that limiting access? 
If there is an opportunity | for Adobe to make some money while we citizens get some really cool | information technology, I'm all for it. Me, too. I'm heavily invested in the continuing success of numerous Adobe products, so, yes, I do want to see them make money. Of course, in this matter I have learned to trust them to look out for their own interests quite effectively. ;-) I just wonder about their giving away copies of the Acrobat Reader, you know. High up-front development costs and then -- just where do they get the bucks out of this? Makes me want to sell my Adobe stock, it does. ;-) Sorry, couldn't resist. | If there are legitimate concerns about information access, I'm all | ears. You can convince me to change my viewpoint with facts, but so | far I'm seeing a whole lot of hollering and hype, and I don't see the | fire... And so, convincing people is what this note is for. I hope it was worthwhile reading. But I'm a little disturbed by your use of 'legitimate' here. I hope that concerns raised on behalf of print-impaired people and those whose ideas of information access are more demanding than yours, even though not directly impacting yourself, are nonetheless not labeled lacking in legitimacy. Indeed it's *not* just a matter of information access by a certain fraction of the disabled. SGML documents will enable a higher level of access to all people. At a time when the amount of information published by the Federal government is prodigious already and increasing at a breathtaking pace (doubling roughly every two years, I've heard), having that information in a digestible form is equally important to having access at all. Which means, I'm saying, that access via a page description language is, for some purposes, no access at all. Just imagine doing a computer-aided analysis of a Federal document in PDF form. PDF should not be the primary document's representation; that should be SGML's role. PDF should be offered as one delivery vehicle of many. 
[Mike Paciello] | ICADD is working hard to block this, as well as the SGML Consortium. | We hope to get your support. For your information, ICADD is a group specifically engaged in broadening access to information by print disabled persons. I suggest that someone with direct knowledge of the organization might wish to post some information saying exactly why they are working hard against the proposal in question. [Yuri Rubinski] | If you have feelings in this area, or want to know what's going on, | contact Mike Rubenfeld at NIST, +1 301 975 3064. And I intend to do just that. I hope others will as well. -- Arthur Ogawa/Publishing Consulant/TeX Consultants ogawa@teleport.com/voice:+1 209 561 4585/fax:4584 Mail: Kaweah CA 93237-0051 PGP-Key: "finger -l ogawa@teleport.com" Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:28:54 UT From: Bowden Wise \ Organization: Rensselaer Polytechnic Institute Computer Science, Troy NY Message-ID: <37h6cm$7f1@usenet.rpi.edu> Subject: Where are DTDs for HTML? Where can I find the DTDs for the various implementations of HTML (e.g., 1.0, 2.0, Plus)? Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:39:40 UT From: "Claude L. Bullard" \ Message-ID: <9410121739.AA14593@source.asset.com> References: \ \ \ \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Personally, I would much rather do everything in SGML, perhaps the | environment I am in will be there some day, not yet tho. [Peter Flynn] | I dunno, there are a lot of jobs more suited to Word in an office | environment right now, while SGML software remains relatively | underused. [Eliot Kimber] | From what I've seen, it appears to be a tool that can fit into a larger | SGML-based strategy that includes a variety of tools, SGML and | non-SGML, appropriate to specific users and tasks. If I may add my two dollars worth. For years I held the same opinion as Mr. Kaikow, but the changing technology and breadth of applications forced me to recant. 
What Mr. Flynn says is IMO true in one sense: how well the interface matches the user's expectations for the application strongly influences their acceptance of the product. This isn't just an issue of habit; the user interface strongly affects productivity in many ways including the learning curve for the product as well as the ease of entry (repetitive actions, modality of actions, etc.). My real support here goes to Mr. Kimber. I've a lot of faith in Eliot's judgment. If he says Microsoft Word is viable, I accept that. What he adds that is most important IMO is the concept that SGML tools have to be "appropriate to specific users and tasks." While general purpose SGML tools are certainly useful, they tend to also be expensive and difficult to use. This is typically true of most "one size fits all" solutions: snug on some, baggy on others, too tight where it hurts. |-) For newbies who may not understand why I take this position, here is a real world example. First, a WYSIWYG editor is built to make it easy to work with what some refer to as the "book metaphor". That is, it represents the printed page or even the book-like display on a screen. *Metaphor* is used here to indicate that a "model" of some real object has been used to design an interface in which the data entry model closely resembles the intended output media. So for element types like this: \ \ and to some extent this \ can productively use an interface which uses the book metaphor. The differences between the two point out something about using a general SGML editor. While in the first case one can reasonably determine with a simple SGML editor what the semantic intent of each generic identifier is, (yes, help with the attributes is needed), in the second case the brevity of the generic identifiers requires a little more knowledge as it has been designed (I'm guessing) to optimize file size for its operating environment. 
When a more complex editing application is used that either hides the element names or always provides online help for interpreting the semantics, both are equally easy to apply. As both still result, more or less in page-like display, WYSIWYG editors can be used to create them. It's when one encounters an element type like this: \ \ or this \ that special services are required of the user interface metaphor because the book metaphor does not adequately represent them. (Cross my heart, those ARE real world examples. I apologize to Dan Connolly in advance if the expansion of the HTML element type is incorrect. I'm doing this during lunch. |-)) The first is an IDEF model from a re-engineering application. A diagram interface is used with dialog entry objects. The second is from a scripting language for a hypermedia application. The same graphical technique with dialog boxes for data entry works well here. See the Hughes Aircraft AIMMS, the Lockheed Ft. Worth IETM editor, and the KBSI automated modeling tools for examples of this type of interface and representation. In the case of the IETM, the editor is diagrammatic, but the output is an onscreen technical manual that does not resemble a book. In certain cases, while the output format should be available to the author, it is not the best model for the user interface. This is also clearly demonstrated by the Microstar Near & Far DTD editor where the requirements to apply a design methodology in a team of non-SGML literate users has precedence over learning the SGML syntax. Yes, the book metaphor can be used for entering all of these, but it isn't productive on the data entry side and isn't useful or productive to the output side. This is a cautionary tale to those newbies who are trying to build or procure inter-enterprise SGML applications. The strength of SGML is that it can be used to represent many information needs, but all of them cannot be productively authored or presented with a single tool. 
However, given a well-thought out plan for applying them, SGML allows them to interoperate where needed. ...and that's the beauty of the beast. Len Bullard Newsgroups: comp.text.sgml Date: 12 Oct 1994 17:46:51 UT From: Karen Coyle \ Organization: University of California, Berkeley Message-ID: <37h7oh$kth@agate.berkeley.edu> Subject: SGML to ASCII I've been assigned the un-enviable task of reducing SGML documents to plain ASCII for the users of our information system who are limited to "dumb terminals." This is greatly complicated by the fact that these particular documents are very mathematically oriented. We will inevitably lose some information in this translation, but I would like to hear if there are any accepted standards (true or de facto) for ASCII representation of \

or
TeX expressions.

-- 
Thanks,
Karen Coyle                    kec@stubbs.ucop.edu
University of California
Library Automation

Newsgroups: comp.text.sgml Date: 12 Oct 1994 18:04:44 UT From: Chris Biber \ Message-ID: <9410122107.AA2519@notes.microstar.com> References: \ \ \ Subject: Re: Microstar [John Turner] | Has anyone heard of a company called Microstar? Apparently they have a | graphical interface to SGML. I would appreciate any information about | their product. [Christoph Altenhofen] | I suppose you think of Near\&Far, a tool to handle DTDs in a graphical | manner. I saw a demo on the CeBit in Hannover this spring, but didn't | take a closer look at it. In my opinion it's a good tool to visualize | the dependencies and structure of a DTD, but I don't think that | creating a DTD is made so much easier as the traditional way, coding | DTDs by hand. [Peter Flynn] | I have their demo disk. It's pretty good, but it _is_ only a demo, you | can't actually _do_ much with it. But the idea is nice and I think it | might make a good teaching aid. Me, I'll stick to Emacs and | psgml-mode. | | The disk and doc does say that they encourage copying of the demo, so | I'll zip the contents when I get back to the office and put it on my | ftp server: look next week on www.ucc.ie in pub/sgml for nearfar.zip or | something like that. Mail me to hassle me if it doesn't appear :-) I am always surprised to see how little is known about our product, NEAR & FAR. NEAR & FAR is not only a DTD visualizer but is also the only tool for the graphical _creation_, _modification_ and _maintenance_ of DTDs. Being able to visualize the structure of your DTD is in itself of great benefit to document authors and DTD creators. Being able to visually _create_ the DTD brings measurable productivity increases to the process of DTD design. NEAR & FAR's interaction with a central tag and structure library (CADE Groupware) creates a complete document analysis and design environment. 
The demo disk (nfdemo.zip and nfdemo.txt) can already be found on the net at the following location: ftp://ftp.ifi.uio.no/pub/SGML/Demo/nfdemo.txt and ftp://ftp.ifi.uio.no/pub/SGML/Demo/nfdemo.zip When looking at the demo disk, please keep the following in mind: - It only includes compiled DTD fragments; the full version imports, validates and displays arbitrary DTDs to any level. - validation, reporting, printing and other advanced features are _not_ supported on the demo disk; they are, however, supported in the normal version of NEAR & FAR. I would appreciate any feedback from N\&F users out there - Case studies, problems encountered, suggestions are all welcome. I would also like to hear from those of you who know NEAR & FAR but have decided not to use it. I personally am somewhat baffled that _anybody_ would still exclusively code by hand (oops... that's my marketing genes speaking :-). What tools are you using to document the DTD? -- /\\ /\\ Chris Biber - cbiber@microstar.com \\/ \\/ Computer /\\ /\\ Aided Microstar Software Ltd. Phone: +1 613 727-5696 \\/ \\/ Document 34 Colonnade Road North Fax: +1 613 727-9491 /\\ /\\ Engineering Nepean Ontario Can/US: 1 800 267-9975 \\/ \\/ CANADA K2E 7J6 Info: cade@microstar.com Newsgroups: comp.text.sgml Date: 12 Oct 1994 18:37:14 UT From: Mark Walter \ Message-ID: \ References: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes, but, in fact, it handles quite a few of the messy | instances where these two document representations do not map on a | one-to-one basis. It covers everything you can do in Word, including | math and tables. [Richard Matzen] | This seems to imply that the formatting info associated with the styles | is output to the SGML file. However, in all of the Microsoft press | releases I havent seen any references to style info being output. 
So | my questions are: | | Does anyone know if/how this product handles the exchange of style | (formatting) info? Is any style info output? If so, how complete is | it and in what form is it: attributes? Similar to Rainbow DTD? | | Also, same question for WordPerfect's SGML product. Sorry if I implied it exports style info. It does not. Microsoft believes that's what RTF is for :-) I haven't heard of any plans for FOSI, DSSSL, etc. Ditto for WordPerfect, last I heard . . . Mark Walter mwalter@seyboldpub.ziff.com Newsgroups: comp.text.sgml Date: 13 Oct 1994 00:09:57 UT From: "Wayne L. Wohler" \ Organization: IBM Information Development Strategy and Tools Message-ID: <37htslINNogp@afshub.boulder.ibm.com> Keywords: MSOCHAR MSICHAR SGML Subject: function character parsing I've been doing some work to support the parsing of double byte character sets (DBCS), which in IBM terms means mixing of single and double byte character encoding. In EBCDIC, this is accomplished using two escape characters, one to shift into double byte mode and one to shift back to single byte mode. On the face of it, these two escape characters seem a perfect match for the two function characters defined by SGML MSOCHAR and MSICHAR. In my testing using our parser (a version of YASP), I discovered that this works fine for element content and CDATA attribute values but that this solution does not work for some other contexts, specifically PIs, comments and CDATA marked sections. To test this feature yourself, the following lines need to be placed in the SGML declaration in the FUNCTION section of the syntax defn: SO MSOCHAR nn SI MSICHAR mm -- Wayne L. 
Wohler Internet: wohler@vnet.ibm.com Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL Information Development Strategy and Tools Phone: +1 303 924 5943 IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191 Newsgroups: comp.text.sgml Date: 13 Oct 1994 02:55:17 UT From: Dexter Medley \ Organization: Express Access Online Communications, Greenbelt, MD USA Message-ID: \ References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> \ Subject: Re: Accessibility and access to the Internet [Darin Duvall] | I've been pretty good about reading all of my newsgroups lately, and I | haven't read anything that would suggest that adopting PDFs as a | standard would have any sort of harmful effect (see messages below). | Did I miss that part? [Arthur Ogawa] | First, many thanks to Darin for posting this note; I hadn't seen | anything else to date. I intend to contact the NIST with my concerns, | and I hope others in this newsgroup do likewise, regardless of which | side of the question you may fall on. With all due respect to Yuri and others in this thread, it's much ado about (probably) nothing. Here's a little more of the straight poop: NIST has convened a so-called "blue ribbon" panel composed of approximately 25 government agency representatives who are involved in electronic document publishing and dissemination to formulate advice to NIST concerning 1) whether or not there should be a FIPS for a page description format, 2) what capabilities that format should provide, and 3) what that format should be. Toward these ends, Adobe has already made a presentation of the capabilities of its PDF; competing portable document technologies will be presented to the panel at an upcoming session (e.g., Replica, Common Ground, and Envoy are included). The panel will then formulate its recommendations. 
The reasoning of NIST and the panel (a significant number of whom are from agencies already very active in and committed to SGML) is that such a delivery method may be of use to government agencies for relatively small applications and for a relatively short period of time, until SGML, DSSSL, SPDL, etc., further mature commercially and more COTS practical SGML solutions are available, and that there may be benefits to be gleaned from standardizing on one solution. Virtually no one sees page description formats as a competitor to SGML, but rather as a complement or adjunct. Since someone already mentioned Mike Rubinfeld of NIST, who, along with Larry Welsh, is sponsoring this effort, it would be much more efficient for someone to ask them directly about this program before getting this world-wide representative group in such a tizzy... (Usual dis/claimer: I'm speaking for myself here, not the government; and yes, I'm one of those on the panel, so this is first-hand information.) -- Larry Newsgroups: comp.text.sgml Date: 13 Oct 1994 03:54:52 UT From: Jeff Schwartz \ Organization: Information Resources and Technology Message-ID: \ Subject: HoTMetaL Pro for Macintosh out yet? A couple months ago, a SoftQuad rep told me HoTMetaL Pro would be available for the Macintosh this fall. Anyone have any info on this? -- / Jeff Schwartz / http://edu-153.sfsu.edu / / webdog@aol.com / "A friendly little pothole on / / :::::::::::::::: / the information superhighway." / Newsgroups: comp.text.sgml Date: 13 Oct 1994 05:20:04 UT From: Harry Justice \ Organization: ICCS, Inc. Message-ID: <00985E2D.FB9F9AA0.198@tanuki.twics.com> Subject: Public domain full-text I am looking for public domain full-text engines/source code for unix. Does anyone know of any or know where a good place to look would be? Thanks Newsgroups: comp.text.sgml Date: 13 Oct 1994 05:28:53 UT From: Bill Wiest \ Message-ID: \ Subject: What is SGML? Hi, I am pondering the question "What is SGML?" 
and I have a few questions to ask here relating to that. Would it be fair to say that the reference concrete syntax is *really* more of a generalized markup language definition language than a concrete example of SGML? Is it true that any markup terms defined using the reference concrete syntax (or any variant concrete syntax) are in fact concrete examples of SGML? Is SGML per se abstract? Thank you for your help. --Bill Wiest (bwiest@suspects.com) Newsgroups: comp.text.sgml Date: 13 Oct 1994 06:41:43 UT From: Gary Benson \ Organization: Fluke Corporation, Everett, WA Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> Subject: Re: Anyone have any info about WORD's recent SGML release? [Robert Ducharme] | "Automatic tagging" may mean little more than taking something tagged | with the MS Word style "whatever" and surrounding it with | "\\" in some kind of exported ASCII files. | I have heard that Microsoft employs many OmniMark programmers and uses | SGML extensively for their CD-ROM products. Given this and the | existence of the Word add-on, their lack of presence on comp.text.sgml | is a bit surprising--or, maybe their silence should tell us something. I don't get it, "...should tell us something". Seems to me there are any number of ways to read this: * Microsoft has no in-house expertise, and doesn't want to advertise that fact while they scramble to get on board. * Microsoft has a grand plan about how they are going to lead the SGML parade and want to keep mum until they have a few more ducks lined up. * Microsoft is not yet sure which way the wind is blowing and is just keeping all bases covered until the direction becomes clearer. .. or is what you had in mind really so self-evident and I'm Just Not Getting It (Again) [TM] ?? -- Gary Benson-_-_-_-_-_-_-_-_-_-inc@tc.fluke.com_-_-_-_-_-_-_-_-_-_-_-_-_-_- Inventions reached their limit long ago, and I see no hope for further development.
-Julius Frontinus, 1st century AD Newsgroups: comp.text.sgml Date: 13 Oct 1994 06:54:01 UT From: Gary Benson \ Organization: Fluke Corporation, Everett, WA Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Howard Kaikow] | Having now reviewed the Fact Sheet that describes the MS plans for SGML | Author for Word, it would appear that Word would be able to | create/input an SGML file. However, the formatting will still be | restricted by the oversights in Word, e.g., to use my favorite example, | you still will not be able to use dictionary style definitions and do | automagic numbering of clauses and the table of contents. Hard coding | is just unacceptable. | | It is unclear as to what restrictions will be placed on the | specification of a DTD. Yes. My boss read the sheet, picked up on the line about "all the capabilities of Word", and asked, "Is that right? Can SGML do running heads?" My answer (as usual) was, "Of course. It can do hypertext links and index hits and tables of contents, too". He pursued the matter, "But running heads are a page-oriented feature. I thought SGML was about data structures, how can I tell it about page things like footer fonts and other page fixtures?" I admit I did not have a good answer. What did I miss the opportunity to tell him this time? -- Gary Benson-_-_-_-_-_-_-_-_-_-inc@tc.fluke.com_-_-_-_-_-_-_-_-_-_-_-_-_-_- Inventions reached their limit long ago, and I see no hope for further development. -Julius Frontinus, 1st century AD Newsgroups: comp.text.sgml Date: 13 Oct 1994 13:42:49 UT From: Paul Constantine \ Organization: Yale University Message-ID: \ Subject: SGML '94 Could someone either post (re-post?) or e-mail me information regarding SGML '94? Thanks! Paul J. 
Constantine Electronic Text Center Yale University Library Paul.Constantine@Yale.Edu Newsgroups: comp.text.sgml Date: 13 Oct 1994 13:48:16 UT From: Seong-Joon Yoo \ Organization: Syracuse University CIS Dept. Message-ID: <37jdr0$ea5@newstand.syr.edu> Keywords: SGML user's group Subject: How can I join the SGML user's group? Is there any site that I can send email to join the SGML user's group, especially the SGML database SIG? Please give me any pointers. Thank you. -- Seong-Joon Yoo sjyoo@top.cis.syr.edu Ph.D. Student CIS Syracuse University Syracuse, NY 13244 +1 315 443 1073 Newsgroups: comp.text.sgml Date: 13 Oct 1994 15:33:35 UT From: "W. Eliot Kimber" \ Organization: Passage Systems, Inc. Message-ID: \ References: <9409067814.AA781434242@njcorp.akbs.com> <373lbt$qsh@doc.cs.nyu.edu> \ \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Gary Benson] | Yes. My boss read the sheet, picked up on the line about "all the | capabilities of Word", and asked, "Is that right? Can SGML do running | heads?" | | My answer (as usual) was, "Of course. It can do hypertext links and | index hits and tables of contents, too". He pursued the matter, "But | running heads are a page-oriented feature. I thought SGML was about | data structures, how can I tell it about page things like footer fonts | and other page fixtures?" I admit I did not have a good answer. | | What did I miss the opportunity to tell him this time? The problem is the word "do" as in "It can *do* hypertext". Unfortunately, SGML doesn't *do* anything -- it's merely a data representation system. It's the processors that act on SGML data that do these things. Therefore, you can get running heads from an SGML version of Word because Word does running heads. As long as you can tell it from what part of the SGML-represented data to get the running head information it needs, Word is happy. 
Ditto for TOCs or index entries or any of the other things that would normally be generated from the core data elements (e.g., section titles get used as the running head text and as the text for TOC entries). This sort of conversation shows the damaging effect that word processors and DTP systems have had: they've caused people to forget that much of what goes on a page or on a screen is transitory and generated, not an intrinsic part of the base information. Word processors (more than DTPs), because they deal purely with the page representation of the information, force us to think about and author explicitly all "page objects", with the true information objects implied by the page objects. SGML-based systems are the reverse: they force us to think explicitly about the information objects and give us the page objects as part of their value add (I almost said "for free", but that's not quite true, although it's almost true). I have worked with structured information for many years, first using line editors like Xedit and VI, then using SGML editors. I always considered myself close to optimally efficient with those systems. Lately I've been forced to use DTP and word processing systems. Let me tell you that these systems are *not* easier to use. It is very much harder to do many things in DTP systems than it is to do them in SGML systems. What is easier to do is to get a particular page image. But consider the problem that Frame and Word both present when you want to represent a simple document type with elements that can occur in many contexts. 
Because styles in Word and Frame have little or no understanding of their surrounding context, it means that where in SGML I might have the element types UL, OL, and LI, representing nestable ordered and unordered lists, in Frame I'll need the following components to represent just two levels of nesting:

  UL1item-first
  UL1item
  UL1item-last
  OL1item-first
  OL1item
  OL1item-last
  UL2item-first
  UL2item
  UL2item-last
  OL2item-first
  OL2item
  OL2item-last

Plus these paragraph components to allow paragraphs within list items:

  LIPara1
  LIPara2

Clearly there is a combinatorial explosion as your documents become more complex and as you try to capture more semantic distinctions with the style names. Compare that with the equivalent SGML element declarations: \ \ \ Using Frame or Word, it is very easy for a user to get the nesting list structure wrong: they'll forget a -first or -last item, they'll put an OL2 before an OL1, etc. With an SGML editor, they can't get it wrong. The problem with SGML editors, IMHO, is not that SGML is hard to work with, but that the user interfaces have not reached the level of smoothness that word processors and DTP systems have achieved. This EUI quality is independent of the actual features the systems provide. For example, I like Frame's user interface a lot -- I find it pretty smooth and well optimized for the tasks, even though the tasks themselves are often much more tedious than they would be if I was using an SGML editor. If Author/Editor or Arbortext had the same EUI qualities as Frame or Word, they'd be hands-down killer apps. This is why I think both the Word offering and FrameBuilder have real potential: they have good user interfaces. At this point, FrameBuilder lacks only true native SGML support and a little better protection of the markers that represent SGML-specific constructs like attributes.
But for a Frame user, moving to FrameBuilder will probably be to their advantage because the presence of structure makes Frame *even easier to use*. Even without SGML import/export out of the box, FrameBuilder helps because it makes Frame documents easier to create and makes the Frame data itself more reliable and consistent. The SGML Author product should have the same properties and benefits. -- \
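
[Editorial sketch, not part of the original post.] The style-name explosion described above can be counted with a few lines of Python. This is only an illustration under stated assumptions: the naming pattern (kind, nesting level, then -first/plain/-last, plus one LIPara per level) is taken from the list in the post, while the function names frame_style_names and sgml_element_types are hypothetical and belong to no Frame or SGML tool.

```python
from itertools import product


def frame_style_names(list_kinds=("UL", "OL"), depth=2):
    """Enumerate the style names a context-blind style catalogue would need.

    Hypothetical helper: the naming scheme mirrors the list in the post
    (e.g. UL1item-first), it is not an API of any product.
    """
    positions = ("item-first", "item", "item-last")
    names = [
        f"{kind}{level}{pos}"
        for level, kind, pos in product(range(1, depth + 1), list_kinds, positions)
    ]
    # One extra paragraph style per nesting level, as in the post.
    names += [f"LIPara{level}" for level in range(1, depth + 1)]
    return names


def sgml_element_types(list_kinds=("UL", "OL")):
    """Element types needed when nesting is expressed by context instead."""
    return list(list_kinds) + ["LI"]


if __name__ == "__main__":
    for depth in (1, 2, 3, 4):
        styles = frame_style_names(depth=depth)
        print(depth, len(styles), len(sgml_element_types()))
```

With two list kinds and two levels the helper yields the same fourteen names listed in the post; each further nesting level adds seven more, while the number of SGML element types stays at three.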

Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:10:20 UT From: Erik Naggum \ Organization: Naggum Software; +47 2295 0313 Message-ID: <19941013T181021Z.enag@ifi.uio.no> References: <37htslINNogp@afshub.boulder.ibm.com> Subject: Re: function character parsing [Wayne L. Wohler] | In my testing using our parser (a version of YASP), I discovered that | this works fine for element content and CDATA attribute values but that | this solution does not work for some other contexts, specifically PIs, | comments and CDATA marked sections. this is certainly not what one expects. I surmise that this interpretation is caused by very literal reading of a few requirements in the standard out of context. re processing instructions: "no markup is recognized in system data other than the delimiter that would terminate it." [341:4] re comments: "no markup is recognized in a {comment}, other than the /com/ delimiter that terminates it." [391:12] there is a note to a similar effect for marked sections. \ but this is not all. MSICHAR, MSOCHAR, and MSSCHAR are not markup, but function characters, so are not "recognized" or "not recognized" to have their effect. from 6.2 SGML Entities: "If an {SGML character} is a {function character}, its function is performed in addition to any other treatment it may receive." [297:20] #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:20:18 UT From: Ed Holst \ Organization: Lockheed Aeronautical Systems Company Message-ID: <1994Oct13.182018.11056@enterprise.rdd.lmsc.lockheed.com> References: <9409117818.AA781866386@njcorp.akbs.com> Subject: Re: Change of employer & contact info for Chery Do I know this lady from a past life? Welcome to the happy world of Interneting (sp?). -- PEACE--THROUGH SUPERIOR FIREPOWER THE OPINIONS EXPRESSED HEREIN ARE NOT NECESSARILY MY OWN!! SOMETIMES I JUST CAN'T HELP MYSELF!! 
eholst@grunt.lasc.lockheed.com Newsgroups: comp.text.sgml Date: 13 Oct 1994 18:46:55 UT From: "Liam R. E. Quin" \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Oct13.184655.26716@sq.sq.com> References: <36bvg6$n9e@kaa.heidelbg.ibm.com> \ <1994Sep30.024603.29919@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Peter Kerr] | ISO 10646 compliance could be a systematic way to ensure that | authoring/browsing systems work properly. [Richard L. Goerwitz] | I don't think ISO 10646 is the be all and end all of encoding | standards. There will always be simple 8-bit systems, and there will | probably be many many 16-bit, 32-bit, and perhaps still other systems. This is silly (sorry). ISO 10646 does not say anything about the width of an integer. You could implement ISO 10646 on a Commodore PET if you wanted, were silly enough, and had enough time. What (Unicode + 10646) says is, when you see this sequence of input bytes, this is the character that has been represented. when you see this set of characters in this sequence, produce this combination of glyphs. | In fact, what we'll need - for maximum flexibility - is a language at- | tribute *and* an encoding attribute. Both will need to be embedded in | the document. Until recently I would have agreed. I have since understood more about Unicode and ISO 10646. I _do_ agree that you need a language tag. You probably also want a DefaultScript attribute, since you can write (for example) Vietnamese text in a variety of texts, or, if you prefer, you can present Greek or Devanagari (whether Sanskrit or Hindi) in a roman transliteration. It might be possible to put the burden of translating from (say) UTF-1 into ISO 88951-2 in the server, as part of an HTTP negotiation. That wouldn't help mixed-script documents, though. 
Note that if you want to give every glyph a distinct character, and don't have floating accents, you need more than 256 characters to represent polytonic (Ancient) Greek, because there are more than 256 glyph combinations [see note 1 for examples]. I am using Greek as an example partly because there are many classicists and bible scholars [2] on the net, and partly because it makes some of the issues clearer to people intimidated by large Asian character sets! Why am I saying all this? :-) 8-bit systems cope with this today, usually with floating accents, but also by mixing fonts on the screen, e.g., with escape sequences. An `8-bit-character-set mentality' (I know you were not espousing this) causes considerable problems, not just for multi-lingual documents. A richer system that maps into 8-bit display codes with multiple glyph sets is entirely possible. Don't rule out Unicode or ISO 10646 because you don't have a 32-bit computer! Lee Notes [1] The main Greek accented combinations are \\a /a \\`a \\'a /`a /'a \\~a /~a `a 'a ~a where a can be a, ai, e, h, i, o, u, w, ei; the `ai' indicates an alpha with a tiny iota (iota subscript) under it. \\ and / are lenis and asper (smooth and rough breathing), ' is acute accent, ` is grave accent, ~ is tilde/long/circumflex; in addition, rho can take \\ and / (as in `rhythm'), and in some schemes there can be a dot over or under any letter independently of other accents, for example to mark inferred or corrupted text. You can experiment with the shareware Wingreek package on the PC, which also has limited support for bidirectional Hebrew with vowel marks. [2] New Testament Greek can be done with 8 bits, because it's OK to omit many of the accents.
However, at least some printings of the New Testament use the full accents, which is a big help to beginners at learning the language (as opposed to the script) like me :-) -- Liam Quin, Manager of Contracting, SoftQuad Inc +1 416 239 4801 lee@sq.com HexSweeper NeWS game;OPEN LOOK+XView+mf-fonts FAQs;lq-text unix text retrieval SoftQuad HoTMetaL: ftp.ncsa.uiuc.edu:Web/html/hotmetal, and also doc.ic.ac.uk: packages/WWW/ncsa/..., gatekeeper.dec.com:net/infosys/Mosaic/contrib/SoftQuad/ Newsgroups: comp.text.sgml Date: 13 Oct 1994 20:07:57 UT From: Simon North \ Reply-To: north@direct.iaf.nl (Simon North) Message-ID: <53@direct.iaf.nl> Subject: Why SGML and NOT Acrobat I, too, would like to throw in my 2 cents to the discussion. I am not an American citizen and didn't (may not) vote for Clinton (or her husband) and so I have little chance to directly influence what happens on your side of the "pond". However, ... America is a member of NATO, and the NATO countries very often follow the American example. As an illustration, I have (draft) copies of JSPs (British "Joint Service Publications" - closest equivalent to US DoD and Mil standards) for IETMs. 90% of the text is copied directly from the three US Mil standards. So, certainly from the defense point of view, whatever "you lot" get up to is extremely important to us in Europe. Please remember that when you go lobbying. More to the point, when the company I work for was taken over by another (foreign) country we were asked/instructed to investigate what the cost would be to convert to using the same DTP/WP software as them. Without going into the gory details, the cost was FRIGHTENING. Even if the company was prepared to spend the money, the resources are simply not available (time & manpower mainly). Filtering helps a little but even with these packages (believe it or not, FrameMaker and Interleaf), a lot of manual intervention is still required and what is far, far worse, the results are unpredictable.
Even within our software packages, migrating from version to version NEVER goes without problems ... and, as pure customers of these products we are at the "blunt end", often left with no choice but to accept what the manufacturer thinks we want, or knows what is the best consensus of the total market requirements. The issue is not just accessibility. If I were that bothered I could either distribute a run-time viewer with the documentation, or I could simply use something like (Display) PostScript (also an Adobe product). I am prepared to grant the point that our interests and needs are very specific (longevity of the data must be in the region of 15+ years), but THE reason that SGML is so attractive is not that it is platform independent - far more importantly it is VENDOR independent - or at least, it has the potential to be so (pause for a sigh in the direction of Microsoft, Frame Technologies and Interleaf ... boy, you guys are making it soooooo hard with your not-quite-SGML applications that are giving me even more grey hairs working out how to get SGML into your products and then get it out again). SGML is not THE answer, yet, (far too document and sequentially oriented), HyTime gets closer (I'm still waiting for the first real application software), maybe SGML2 will be THE answer. Whatever, the important thing is that SGML is an ISO standard which, IMHO, makes it OUR property NOT that of some software company ... Needless to say, this is MY e-mail account and also my opinion.
-- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: comp.text.sgml Date: 14 Oct 1994 02:23:12 UT From: Evan Rudowski \ Organization: The Pipeline Message-ID: <37kq2g$ivv@pipe2.pipeline.com> Subject: Re: CALL FOR SGML NEWS ITEMS for SGML '94 [Yuri Rubinsky] | C A L L F O R B I T S | _________________________ | | By now a five-year tradition, the SGML Year in Review continues to open | both the North American and European SGML series of conferences. | | As always, it attempts to represent exciting new projects and | initiatives as well as provide updates on on-going work of interest to | the conference attendees and to the readers of this newsgroup and the | half-dozen newsletters that now reprint the presentation. | | SGML '94 is fast approaching (Nov 6 tutorials; Nov 7-10 conference; Nov | 10-11 SGML Open events), and with it, the Year in Review Call for | Contributions. How can I get details about SGML '94? Thanks. -- Evan Rudowski Newsday Electronic Publishing EVANRUD@PIPELINE.COM Newsgroups: comp.text.sgml Date: 14 Oct 1994 03:51:38 UT From: Tim Arnold-Moore \ Organization: CITRI Message-ID: <37kv8a$6p3@etrog.citri.edu.au> References: <374l6q$fuv@usenet.rpi.edu> <3757ki$4e5@scapa.cs.ualberta.ca> Subject: Re: Storage Models for SGML data [Chiradeep Vittal] | Documents are complex entities (hierarchical, as you point out). An | RDBMS is hardly the ideal candidate to model this complexity. An | object oriented DBMS would be the ideal candidate because it can do all | the 3 things you have listed above. There is a paper around describing exactly such a system: V. Christophides, S. Abiteboul, S. Cluet and M. Scholl "From Structured Documents to Novel Query Facilities" in Proceedings of the ACM SIGMOD Conference 1994, Minneapolis, MN, pp313-324. 
However I am not convinced that this approach is actually feasible for any reasonably sized collection. OODBMS's are just too slow and you end up having to do an awful lot of join operations. The work at INRIA is interesting: G.E. Blake, M. P. Consens, P. Kilpelainen, P.A. Larson, T. Snider and F.W. Tompa "Text/Relational Database Management Systems: Harmonising SQL and SGML" in Proc. Int. Conf. on Applications of Databases, 1994, Vadstena, Sweden, pp 267-280. as it is based on PAT trees, a novel indexing/data modelling approach. Here at CITRI we have also done some work criticising both of these approaches, preferring an element-based text-specific model over a mapping to another paradigm, as this brings the issue of efficiency to the fore in designing the model as well as implementing it. Because we start with a text-oriented model, we can take advantage of all the work in efficient access that has taken place in this field. A tech-report of our work is available and I am quite happy to send a PostScript or dvi file to anyone who wants it. | I'm currently implementing an SGML/HyTime database on a commercial OO | DBMS. A tech report should be available within a couple of weeks. Our | target application is multimedia news, and we assume zero updates to | the document, so our storage model may not be entirely appropriate for | your needs. Might work for small documents like newspaper articles and smallish collections but I bet you have serious performance problems with gigabyte size databases and large documents. -- Tim Arnold-Moore | CITRI, RMIT | Uni. of Melbourne Law School tja@kbs.citri.edu.au | 723 Swanston St | ---------------------------- Phone: +61 3 282 2487 | Carlton 3053 | simul iustus Fax: +61 3 282 2490 | Victoria, Australia | et peccator Newsgroups: comp.text.sgml Date: 14 Oct 1994 05:15:27 UT From: "Richard L. 
Goerwitz" \ Organization: University of Chicago Message-ID: <1994Oct14.051527.25155@midway.uchicago.edu> References: \ <1994Sep30.024603.29919@midway.uchicago.edu> <1994Oct13.184655.26716@sq.sq.com> Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | I don't think ISO 10646 is the be all and end all of encoding | standards. [Liam R. E. Quin] | This is silly (sorry). ISO 10646 does not say anything about the width | of an integer. You could implement ISO 10646 on a Commodore PET if you | wanted, were silly enough, and had enough time. It's only silly because you misunderstood. The process of encoding a writing system is much more complex than a simple mapping of glyphs onto 8-, 16- or 32-bit numbers (or bit-fields). Many issues will go on being debated for a long, long time. ISO 10646 will never satisfy everyone, and it makes sense to allow latitude for alternate encoding schemes. | Don't rule out Unicode or ISO 10646 because you don't have a 32-bit | computer! Again, this has nothing to do with what I was saying - although it's a perfectly valid point to make here. | [2] New Testament Greek can be done with 8 bits, because it's OK to | omit many of the accents. However, at least some printings of the | New Testament use the full accents, which is a big help to beginners | at learning the language (as opposed to the script) like me :-) Okay, now try fully cantillated Hebrew, or Quranic Arabic, or (better yet) Akkadian cuneiform. -- -Richard L. Goerwitz goer%midway@uchicago.bitnet goer@midway.uchicago.edu rutgers!oddjob!ellis!goer Newsgroups: comp.text.sgml Date: 14 Oct 1994 08:38:55 UT From: Johan Henrikson \ Message-ID: <9410140838.AA04551@lars.texcel.no> Subject: SGML and the EEC I'm looking for more information regarding SGML and the EEC. Rumors have it that the EEC is increasing focus on ODA (!) and might be dropping SGML from their recommendation. 
Despite the absurdity of this rumor I would like to check out the current status of SGML in the EEC. If anyone has any suggestions pertaining to relevant documents, government agencies, etc. to contact, please get in touch with me. Thank you in advance, Johan Henrikson -- Johan Henrikson E-mail: johan@texcel.no Texcel AB Address: Storsaetragraend 12, 127 39 Skaerholmen, Sweden Newsgroups: comp.text.frame,comp.text.sgml Date: 14 Oct 1994 09:37:10 UT From: Wiedmer Hans Ulrich \ Organization: Inst. for Machine Tools and Manufacturing, ETHZ Message-ID: <37ljg6$26l@elna.ethz.ch> Subject: SGML-related Frame Products: no university licensing?! Dear all, I am told by a local Frame distributor that there is no university licensing available. This holds for every Frame Product related to SGML (i.e., FrameBuilder, SGML Toolkit, SGML Utilities etc.). It sounds like this is just the policy of Frame Technology Inc. DO OTHER UNIVERSITIES EXPERIENCE THE SAME SITUATION ?????? - if yes: is there an understandable reason for Frame Technology to do so? - if no: could anybody give me an address of a Frame distributor in the "neighbourhood" (whatever that may mean, let's say Europe). My apologies to those who see this twice (you're the right people to address! but there is no possibility to address only the intersection of the two newsgroups). -- Kind regards John Wiedmer Swiss Federal Institute of Technology (ETH) Zurich Newsgroups: comp.text.sgml Date: 14 Oct 1994 12:06:04 UT From: Kent Summers \ Organization: Electronic Book Technologies, Inc. Message-ID: <37ls7c$917@spock.ebt.com> References: <374cqn$3l8@bmw.hwcae.az.Honeywell.COM> <3762vk$irc@deep.rsoft.bc.ca> <37c359$j04@bmw.hwcae.az.Honeywell.COM> Subject: Re: SGML "Viewers" [Kevin Brennan] | Temporary Revision capability allows the using organization to provide | revisions to the text contained on the CDROM without cutting a new | CDROM. 
For example, an airline using maintenance manuals on CDROM may | receive revisions to the manual *between* issuances of new CDROMS. | Jouve's software puts a revision file on the local hard drive and | replaces or annotates text appropriately. In the case of a networked | resource, there can be a shared file on the network. There is, of | course, a separate TR software (optional at some cost) that the | Maintenance Publication department would have to acquire. I'd like to point out that before you can really deliver incremental updates to a browser (local or remote) you really need to be able to detect and track deltas in the context of a sophisticated SGML version control/configuration management system. This is exactly what dynabase is designed to do: provide the electronic equivalent of "printed change pages," but at a much lower level of granularity --> an SGML-based, fine-grain revision control environment. Since EBT makes both dynabase and dynatext, one could expect that we would be making this incremental update capability an integral part of our products in the future. CD-ROM? well, we've heard a lot of them over the past year; and today, it's internet this, internet that. are you starting to get the message? do it once, and send it out your favorite way. SGML is the key. EBT is the enabler. stay tuned. Newsgroups: comp.text.sgml Date: 14 Oct 1994 12:36:13 UT From: Yuri Rubinsky \ Organization: SoftQuad Inc., Toronto, Canada Message-ID: <1994Oct14.123613.21924@sq.sq.com> References: <9409117819.AA781919887@cc-mail.agcomed.uiuc.edu> \ \ Keywords: nist, fips, braille, document, icadd Summary: Call it a FIPS for a portable page display format Subject: Re: Accessibility and access to the Internet [Dexter Medley (Larry)] | With all due respect to Yuri and others in this thread, it's much ado | about (probably) nothing. 
Here's a little more of the straight poop: | | NIST has convened a so-called "blue ribbon" panel composed of | approximately 25 government agency representatives who are involved in | electronic document publishing and dissemination to formulate advice to | NIST concerning 1) whether or not there should be a FIPS for a page | description format, 2) what capabilities that format should provide, | and 3) what that format should be. Toward these ends, Adobe has | already made a presentation of the capabilities of its PDF; competing | portable document technologies will be presented to the panel at an | upcoming session (e.g., Replica, Common Ground, and Envoy are | included). The panel will then formulate its recommendations. (Just out of curiosity, and before I get going here, do I understand this correctly, that PDF got a whole day on its own, and all possible alternatives get lumped into just one day? Just curious.) Larry's first paragraph touches on my main area of concern, and it involves defining the task and therefore defining the scope of the proposed US Federal Information Processing Standard. Larry calls the FIPS "a page description format" which I'm completely comfortable with, but also calls it a "portable document technology". I have no problem with page description format, or page display format, or even portable page display format. My concern is with the presentation of a format geared specifically to page description *as if it were* a "document format". For the purposes of this discussion, I have only one constituency -- and it's not the SGML vendors: I work on the International Committee for Accessible Document Design, the committee working on the accessibility of *documents* by the visually impaired. Note that "documents" in that sentence is used very particularly to imply that there is a body of information pulled together from whatever sources, in whatever fashion, waiting, so to speak, to be presented appropriately to its user/reader. 
I believe that that's the only fair use of the word document, in fact, or else people who can't see the paper, or who can't make out the words or letters on the paper, are completely disenfranchised from the world of documents. Certainly we also get to call those old-fashioned bits of squished pulp with type on them documents too, but a standard that purports to deal with "documents", I think, *must* use the broader definition, the definition that includes the ICADD constituency. Or else, as far as I understand US law, they're in fact breaking it. The Americans with Disabilities Act is intended to prevent government from creating non-accessible information. (I've not read that law; just basing this observation on rumours about that law.) | The reasoning of NIST and the panel (a significant number of whom are | from agencies already very active in and committed to SGML) is that | such a delivery method may be of use to government agencies for | relatively small applications and for a relatively short period of | time, until SGML, DSSSL, SPDL, etc., further mature commercially and | more COTS practical SGML solutions are available, and that there may be | benefits to be gleaned from standardizing on one solution. Virtually | no one sees page description formats as a competitor to SGML, but | rather as a complement or adjunct. This is admirable, and I trust that the FIPS will include a discussion of its temporary status. Adobe contributed heavily to the development of SPDL, after all. Again, note the use by Dexter/Larry here of "page description formats", a completely acceptable term. I hope the FIPS uses the same language. | Since someone already mentioned Mike Rubinfeld of NIST, who, along with | Larry Welsh, is sponsoring this effort, it would be much more efficient | for someone to ask them directly about this program before getting this | world-wide representative group in such a tizzy... I stand by my remarks. 
If there is a tizzy it's because it's worth while there being a tizzy. If the FIPS -- even by virtue of its name -- pretends to be more than it is, I will continue being tizzified. | (Usual dis/claimer: I'm speaking for myself here, not the government; | and yes, I'm one of those on the panel, so this is first-hand | information.) | | -- Larry I appreciate that we have first hand information. I take it that the flip side of that is that we have someone listening who is in a position to make good on the context and role described in his posting -- and will do his best to ensure that the 7% of Americans who have a print impairment are not disenfranchised by a display format that (assuming it's stock PDF): a) cannot, as far as I know, give them the navigation they need since it's lost all the hierarchy that sighted people take for granted. It is foolish to get a screen reader to read you converted PDF pages, in sequence. If you're sighted, how would you like to use a cassette tape recorder to look up something in an encyclopaedia? For sensible navigation you need something that understands the value of presenting the contents as a navigable tree structure, something like IBM Book Manager, Lector, Olias, DynaText, Pathways, SoftQuad Explorer, etc. And the information for that navigable structure *must* come from somewhere, for example, it could simply be implied by the SGML markup. b) cannot, as far as I know, give people who need large print the opportunity to set a default minimum size for all type -- that is they must use PDF as if it holds a printed page: If they want to make a 6point footnote legible as a 24point display, they're also, of necessity, turning the 12point base text into outrageously large 48point. c) cannot -- despite what was said in a previous posting -- be readily turned into Braille because it has already lost the markup needed to generate the blanks, indents and special characters that are the only actual formatting capabilities of Braille. 
(When we see an SGML-enhanced PDF I'm willing to grant this one.) (In fact, probably all my concerns would go away under that circumstance.) I would be very happy to discover I'm wrong about any of these assumptions. I'll retract anything anytime. But nothing I've seen from the postings in this newsgroup or many phone conversations on this subject convinces me otherwise. Don't get me wrong: I'm a fan of Adobe and what it's done in the world. I think PDF solves or partially solves a large set of problems that need solving. I'm only concerned about it purporting to solve *all* the problems -- even ones it wasn't designed to solve. And it all stems from calling itself a "portable document format" as if documents are only for people who can see. Yuri Rubinsky Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:14:20 UT From: Erik Ableson \ Organization: Achilles Networking Inc, Ottawa, ON Message-ID: <37m3ns$j7c@hermes.achilles.net> Subject: DTD2HTML I recently ran across a msg containing a reference to a utility called DTD2HTML, designed to simplify the process of converting SGML data into a format suitable for use with a WWW server. If anyone can point me to a site that has this software (or the mfg, if it's commercial) it would be appreciated. -- Erik Ableson Ableson Consulting eableson@zeus.achilles.net P.O. Box 1641, Stn 'B' +1 613 762 5299 Hull, PQ J8X 3Y5 Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:48:06 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: <37m3ns$j7c@hermes.achilles.net> Subject: Re: DTD2HTML [Erik Ableson] | I recently ran across a msg containing a reference to a utility called | DTD2HTML, designed to simplify the process of converting SGML data into | a format suitable for use with a WWW server. 
No, this just reformats and displays a DTD in HTML form, not an instance, as far as I know (but I'd love to be proved wrong :-) ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml Date: 14 Oct 1994 14:53:18 UT From: Rick Jelliffe \ Message-ID: \ References: \ <374tum$rff@hopper.acm.org> Subject: Re: response to miniSGML [Dave Peterson] | ISO 8879 certainly encourages this "marker in the text" point of view. | While ordinary elements have their content bracketed by a pair of tags, | even if that content happens to be empty, EMPTY elements do not have a | pair of tags, just one tag that says "here goes something special". Goldfarb writes that "An element has a start-tag and an end-tag, but there is a situation in which one of these might not be there... Technically, the content always exists, even if it is empty and looks as if it isn't there." (p307: bad pencil means maybe bad transcription) His comment on 7.3 is also clarified in the Attachment 1 printed in the Handbook. For example, if there is an explicit content reference in a start tag, then (even though an end-tag is not allowed in the markup) the data for that explicit content reference (which the application provides) is treated as if nested between a start tag and end tag. (It is not omittag minimization, because that is discretionary: this is enforced.) But I think we have fallen into the trap of trying to make the standard mean only one clear and definite thing. 
I resign immediately (-: -ricko -- Rick Jelliffe ricko@allette.com.au Newsgroups: comp.text.sgml Date: 14 Oct 1994 15:55:25 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <37m9ld$e50@hopper.acm.org> References: \ <374tum$rff@hopper.acm.org> \ Subject: Re: response to miniSGML [Dave Peterson] | ISO 8879 certainly encourages this "marker in the text" point of view. | While ordinary elements have their content bracketed by a pair of tags, | even if that content happens to be empty, EMPTY elements do not have a | pair of tags, just one tag that says "here goes something special". [Rick Jelliffe] | Goldfarb writes that "An element has a start-tag and an end-tag, but | there is a situation in which one of these might not be there... | Technically, the content always exists, even if it is empty and looks | as if it isn't there." (p307: bad pencil means maybe bad | transcription) | | His comment on 7.3 is also clarified in the Attachment 1 printed in the | Handbook. Goldfarb's interpretation is his own, as is mine. | But I think we have fallen into the trap of trying to make the standard | mean only one clear and definite thing. I resign immediately (-: I sincerely hope that SGML can evolve to interpretations not envisioned by its creators, without destroying the original interpretations. Such evolution is not uncommon. Else why is the mathematical study of periodic phenomena called "trigonometry"? Dave Peterson SGMLWorks! davep@acm.org Newsgroups: comp.infosystems.www.providers,comp.text.sgml Date: 14 Oct 1994 16:37:30 UT From: Brooks Cutter \ Organization: AT&T Paradyne Message-ID: <37mc4a$me8@pdn.paradyne.com> Subject: SGML/HTML Style Sheets? A while ago, I read about an X11 WWW browser (Viola?) that was described as having a "nifty" feature of supporting SGML Style sheets... I know little about formal SGML, but I presume that a Style sheet allows one to define the display layout of an SGML document? 
Can someone please point me towards more information on this? I've looked around the web (including w3catalog), and haven't found any info on this... I've been developing a perl library for parsing and displaying HTML (for a browser I'm working on), and would like to make the parser/display library as generic as possible, and support style sheets... thanks for any info you can provide.. -Brooks bcutter@paradyne.att.com -- Brooks Cutter (I: bcutter@paradyne.com) (U: ...uunet!pdn!bcutter) Unix System Administrator, Engineering Services AT&T Paradyne, Largo Florida (Opinions expressed are my own and do not reflect those of my employer) Newsgroups: comp.text.sgml Date: 14 Oct 1994 17:23:17 UT From: Dave Peterson \ Organization: ACM Network Services Message-ID: <37meq5$ef4@hopper.acm.org> References: \ Subject: Re: What is SGML? [Bill Wiest] | I am pondering the question "What is SGML?" and I have a few questions | to ask here relating to that. | | Would it be fair to say that the reference concrete syntax is *really* | more of a generalized markup language definition language than a | concrete example of SGML? The syntax "productions" that define the language of SGML start by defining "SGML document" in terms of other things (called by comp sci language specialists "non-terminals"), which in turn are defined in terms of yet other things,...,ultimately in terms of characters. Just before the last step, each special character string used for SGML punctuation is given a name. Then the last step maps these names onto specific character strings. For example, the opening character string for an end-tag is named "etago" (for end-tag open), and is "usually" mapped to "</" [...], if you have access to that newsletter. (I wrote the article.) | Is it true that any markup terms defined using the reference concrete | syntax (or any variant concrete syntax) are in fact concrete examples | of SGML? Presumably a "concrete example of SGML" is an SGML document. 
Presumably also a "markup term" is a non-terminal such as "start-tag" or "stago" -- or "SGML document" -- as defined in one of the productions of ISO 8879. If this is what you mean by each of these phrases, then the answer is "no". Just as "book" (the abstract class) is not a book, "SGML document" is not an SGML document. But I suspect this distinction is something most of us would leave to the philosophers. | Is SGML per se abstract? Unless you have some technical meaning in mind for "abstract", I think this one too gets passed to the philosophy section. Did I guess your meanings correctly? Hope I helped. Dave Peterson SGMLWorks! davep@acm.org Newsgroups: comp.text.sgml Date: 14 Oct 1994 18:46:28 UT From: Joel Finkle \ Organization: Searle R&D Message-ID: <1994Oct14.184628.21488@tin.monsanto.com> References: <53@direct.iaf.nl> Subject: Re: Why SGML and NOT Acrobat [Simon North] | SGML is not THE answer, yet, (far too document and sequentially | oriented), HyTime gets closer (I'm still waiting for the first real | application software), maybe SGML2 will be THE answer. Whatever, the | important thing is that SGML is an ISO standard which, IMHO, makes it | OUR property NOT that of some software company ... Personally, I'd like to see a hybrid file format that can serve both masters: 1) A page-turner application that can display the full appearance of the original paper document (that regulations still require) and 2) A structured document editor that allows me to search and extract the portions relevant to me It seems to me that an Acrobat-like file format could be extended to contain the SGML-style tags, that are not (necessarily) visible to the page-turner, but extractable in original SGML form. Displaying an SGML doc on the fly is slow, especially when you want to see something in the middle. Having just a page-turner limits the content-based retrieval. A cross-breed is what I'd like to see. 
-- Joel Finkle Searle R&D jjfink@skcla.monsanto.com "And when I die don't bury me / in a box in a cemetery. Out in the garden would be much better, I could be pushin' up home grown tomatoes" -- Guy Clark, "Home Grown Tomatoes" Newsgroups: comp.text.sgml Date: 14 Oct 1994 19:23:35 UT From: Simon North \ Message-ID: <57@direct.iaf.nl> Subject: DynaText will not print. Any ideas? Under UNIX, the standard DynaText browser is unable to print bitmap illustrations from a separate window. Using our customised version of the user interface (built with the Developers' Toolkit), we can't get DynaText to print _any_ graphics at all from the main text window. Under MS-Windows, the standard interface works fine (after some juggling with printer drivers) but the same custom interface is unable to print vector graphics from the main text window. Has anyone else been experiencing similar (or even different) problems? Desperate minds would like to know. -- Simon North, Technical Author/Translator, SGML & Internet Consultant Private (TCP/IP): Directions (uucp): Daytime job: north@knoware.nl north@direct.iaf.nl db512@hgl.signaal.nl Newsgroups: ba.seminars,alt.hypertext,comp.text,comp.text.sgml,comp.multimedia,comp.unix.misc,comp.graphics Followup-To: poster Date: 14 Oct 1994 20:21:02 UT From: Paul Fronberg \ Organization: Sun Microsystems Inc., Mountain View, CA Message-ID: <37mp7e$j2f@engnews2.Eng.Sun.COM> Keywords: text, multimedia, hypertext, sgml, hytime, document formatting Subject: SVNet Meeting October 19: Extending SGML, Hytime, Hypertext, & Multimedia SVNet Meeting, Wednesday, October 19, 7:30 pm in Mountain View SVNet is a SF Bay area UNIX and Open Systems user's group which sponsors technical presentations at its monthly meetings. The meetings are free and open to the public. 
The next presentation will be: WHAT: Extending SGML: HyTime, Hypertext, and Multimedia Topics to be covered include: - basic SGML concepts and capabilities - the ``formatting fallacy'' - documents as databases - SGML architectural forms and the object-oriented paradigm - HyTime extensions to SGML - HyTime and inter-operable multimedia WHO: Ralph Ferris, Fujitsu Ralph Ferris is the Project Manager for Electronic Publications at Fujitsu Open Systems Solutions, Inc. (FOSSI), a wholly-owned subsidiary of Fujitsu, Ltd. His work is sponsored by the department within Fujitsu's Middleware Division that is responsible for developing word processing and electronic publishing products. His areas of responsibility include market research on hypertext and multimedia, SGML/HyTime tool development, and representing Fujitsu and FOSSI in various industry work groups. WHEN: Wednesday, October 19, 1994 at 7:30 pm WHERE: Sun Microsystems Bldg 6, 2750 Coast Avenue, Mountain View Coast Ave appears to be just a driveway next to Bldg 5 on Garcia Ave between Amphitheatre Pkwy and San Antonio, so don't get confused. For more information, please call either Paul Fronberg at +1 415 366 6403 or Ralph Barker at +1 408 559 6202. SVNet is a UNIX and open systems user group supported by member dues and donations. SVNet Meetings are FREE and OPEN TO THE PUBLIC. UNIX is a registered trademark of X/Open Newsgroups: comp.text.sgml Date: 14 Oct 1994 21:26:48 UT From: "Christopher R. Maden" \ Organization: Electronic Book Technologies, Inc. Message-ID: \ Subject: Correction re: SGML Author for Word I just spoke with Dale Waldt of \, concerning Microsoft's SGML Author. I was incorrect in my interpretation of his article. Regarding the debate on auto-numbering, Microsoft tells him that anything Word can do, SGML Author for Word can do. That is, since Word can auto-number, SGML Author can as well. 
He also clarified (and mentioned that his article was written under deadline) that in SGML Author, one is not authoring in SGML. It uses more complicated Word stylesheets to translate the Word document into SGML, in much the same way as Avalanche (whose technology it uses) or DynaTag. In other news, Microsoft asks that people talk to their sales representatives for more information about SGML Author; there are so many rumors going about, and the sales reps have the inspired Word Of Gates. :-) -Chris -- Christopher R. Maden Electronic Book Technologies, Inc. Applications Consultant One Richmond Square crm@ebt.com Providence, Rhode Island 02906 USA +1-401-421-9550 (voice) +1-401-421-9551 (fax) Newsgroups: comp.text.sgml Date: 14 Oct 1994 21:34:18 UT From: "M. James Bartley" \ Organization: Bell-Northern Research Ltd., Ottawa, Canada Message-ID: <37mtgq$l1e@bmerhc5e.bnr.ca> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: SGML vs PDF: DSSSL alters the equation? [Joel Finkle] | Personally, I'd like to see a hybrid file format that can serve both | masters: | | 1) A page-turner application that can display the full appearance of | the original paper document (that regulations still require) | | and | | 2) A structured document editor that allows me to search and extract | the portions relevant to me The problem with displaying SGML is deciding how to format it, right? And DSSSL is supposed to provide all the answers to this problem, right again? PDF is great for a lot of things - I've got about a dozen source applications whose files I need to distribute to people and SGML is not an option in this particular case, so PDF is a very nice solution. But, if you *do* have SGML, rendering it into PDF is still going to require formatting decisions (whether you embed the SGML into the PDF or not), so wouldn't it be best to use the standard designed for this? 
And if you have DSSSL at the origin (the editor), then you're bound to have DSSSL at the destination (the browser), so what do you need PDF or some other (possibly new hybrid) format for? Why not concentrate on getting the DSSSL support into the SGML editors and browsers so that people will be a lot happier about this situation than they are today? How many SGML tools vendors are making efforts to incorporate DSSSL support into their products? And if anyone says that since DSSSL isn't finished yet so it can't be supported, I'd like to remind them that unfinished standards have never stopped vendors from going after potentially lucrative markets before. Just think ... with HTML getting its wrinkles ironed out so that the WWW will finally have the opportunity to do SGML properly in the form of HTML and take advantage of real SGML tools (have you ever tried to actually *find* something on the WWW? -- it's an absolute nightmare), this is the time to show the world what full-blown SGML can do. HTML browsers are free. PDF browsers are now free. If you want to sell your SGML tools, deliver an SGML browser for free. Show people just how minimal HTML is and they'll beat a path to your door to buy your editor. But that browser had better have a relatively decent (albeit basic) feature set, and that means you'll need DSSSL. I've got people here (more and more every day) who are turning Frame docs into HTML by the boatload and they love the results. It makes me cringe in pain because I try to tell them what they're missing but they think that what they have now is the neatest thing since sliced bread. The way to get through to them is to *show* them the real thing, but there's nothing there to show them. Give me that browser and I'll give you customers, but the browser has to do the job properly - not fancy, just right. -- M. James Bartley, Bell-Northern Research Ltd. \\ Internet: mjamesb@bnr.ca P.O. 
Box 3511, Station C, Ottawa, Canada, K1Y 4H7 \\ Phone: +1 613 763 3556 ** Opinions expressed are mine, not BNR's. - MJB ** \\ Fax: +1 613 763 5048 "Stupid pigeons!" Newsgroups: comp.text.sgml Date: 14 Oct 1994 22:04:58 UT From: Russ Chamberlain \ Organization: WATFAC Message-ID: \ References: <3762qj$4p6@finnegan.iol.ie> Keywords: minor, problems Summary: I've done it with WATCOM C for DOS. Subject: Re: YASP, does anyone have a MS-DOS copy? [Charles Hurley] | Does anyone have a MS-DOS copy of YASP? [Sean Mc Grath] | I'd like to piggy back on this one if I may. I have looked through the | compilation scripts, etc., for YASP but I do not have access to an OS/2 | box to compile up a DOS version as required. | | I guess with a bit of work the REXX script could be analysed to produce | a DOS makefile for Microsoft or Borland and the code compiled directly | on a DOS based PC. If this has already been done, I would | appreciate any information. I managed it in about a day, using WATCOM C. Yes, re-creating the build scripts is a requirement, but it's not too hard. For DOS, create the TOOL, MDC, and CTX libraries. In each makefile, just keep the source-file dependencies, tear out everything else, and insert your favourite compiler and library directives. A key file to have a look at is PARSER\INCLUDE\PORTABLE.H. This file contains all sorts of definitions that will vary depending on the target machine, operating system, and memory model. For some reason unknown to me, the pascal and cdecl calling conventions are used extensively. Fortunately, it was a simple matter to define the relevant macros as empty. Note that at least one file, OCSMIN.H (I think), has some missing end-of-comment sequences (*/). I recall having a handful of _very_minor_ problems, one related to the use of const. Otherwise, all files compiled up just fine. The same makefile that makes the CTX library combines the three libraries into the YASP library. 
As for the test applications, SMP00 is the one you need _and_ require to make first. This application generates OCS (Compiled SGML Declaration) and ODT (Compiled DTD) files. Note that the makefiles aren't kidding when they define a 40K stack. I used a 20K stack on small documents, and sometimes got a stack overflow. To get error messages, make sure the file called YSP.TXT is in the current directory. The RAST application just spits out the ESIS, but in a different format than SGMLS does. Speaking of SGMLS, I'd like to thank the people responsible for creating YASP. Such a clean, elegant, _understandable_, and *free* solution is, IMHO, a great stride forward for SGML and its supporters. I know I speak for many of us when I say: THANK YOU! THANK YOU! THANK YOU! THANK YOU! THANK YOU! THANK YOU! Good luck! - Russ PS - Sorry, I've only got advice today, and no binaries :-( Sorry this is terse, but I gotta go... -- Russ Chamberlain Waterloo Foundation For The Advancement Of Computing (WATFAC) russc@csg.uwaterloo.ca >> "The above are MY views, not CSG's, UW's or WATFAC's" Newsgroups: comp.text.sgml Date: 14 Oct 1994 22:13:28 UT From: "Steven J. DeRose" \ Organization: EBT, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data [Peter Flynn] | This is crucial and remains unaddressed by any SGML software I have | seen. Before you all jump on me and tell me (a) you can say \

or similar; or (b) that editors etc can activate on-screen | section numbering deduced from \

or similar elements, what I mean | is support for Canonical Reference. | | If I am in the midst of an instance, having got there via, say, a | search routine which has returned the exact occurrence of "foo" that I | was looking for, I now want to know where I am: I want the database | system to return not just "`foo' found at offset 123456" but something | like "`foo' found in \ #14 in \ #5 in \ #8 in \ | #3", according to some rules I specified before the search saying that | for all hits I want the structure returned using such-and-such a | hierarchy. In DynaText(tm), use the TEIgen() property-value function. It generates a TEI-conformant extended pointer to whatever element you specify. It supports three variants of the TEI syntax: Either TPATH or PATH from version P1 of the TEI guidelines (now replaced), or the P3 (current) form. P3 is the default; P1 TPATH is almost exactly like your example (as of course you know, Peter). teigen() is smart enough to use the nearest ancestor with an ID and then work down the tree path from there, so you get as robust a pointer as possible. If you're an end user you can't do a lot with the pointer other than display it unless your publisher gives you a place to store it and use it later. But if you have a document using extended pointers, you can set up links in your stylesheet easily to interpret it, using our tei() function. The temporary snag is that tei() only supports P1 syntax so far. The TEI syntax has also been adopted by some other software, and given the enormous amount of data being encoded with TEI, it is incumbent on implementors to support it (it's also dead simple to do -- my C code for interpreting the P1 syntax is 163 lines including comments). 
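A minimal sketch, in Python, of the strategy just described: start from the nearest ancestor that carries an ID and record the path down from there. This is not DynaText's TEIgen(); the function names are hypothetical, the output notation (`ID(...)`, `type[n]`) is invented for illustration and is neither TEI P1 nor P3 syntax, and an XML string parsed with ElementTree stands in for an SGML instance.

```python
import xml.etree.ElementTree as ET


def pointer_to(root, target, id_attr="id"):
    """Return (anchor_id, steps) locating `target` inside `root`.

    anchor_id is the ID of the nearest ancestor-or-self that has one (None if
    the walk reaches the root first). steps is a list of (element type, n)
    pairs meaning "the n-th child of that type", read from the anchor down.
    """
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root and node.get(id_attr) is None:
        parent = parent_of[node]
        same_type = [c for c in parent if c.tag == node.tag]
        steps.append((node.tag, same_type.index(node) + 1))
        node = parent
    steps.reverse()
    return node.get(id_attr), steps


def format_pointer(anchor_id, steps):
    """Render the pointer in a made-up, human-readable notation."""
    start = f"ID({anchor_id})" if anchor_id else "ROOT"
    return " ".join([start] + [f"{tag}[{n}]" for tag, n in steps])


# Hand-written sample document, for illustration only.
DOC = """<book>
  <chapter id="ch1"><section><p>alpha</p></section></chapter>
  <chapter id="ch2">
    <section><p>one</p><p>two</p></section>
    <section><note>n</note><p>three</p><p>foo here</p></section>
  </chapter>
</book>"""

root = ET.fromstring(DOC)
# Simulate a search hit: the first <p> whose text contains "foo".
hit = next(e for e in root.iter("p") if e.text and "foo" in e.text)
print(format_pointer(*pointer_to(root, hit)))  # ID(ch2) section[2] p[2]
```

Anchoring on an ID keeps the pointer valid when material is inserted before the anchored element; only changes between the anchor and the target can break it.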
See our doc for more details on all the above, but here's a trivial example of a TEI P1 link using the TPATH syntax (the same example can be used across documents merely by naming the destination entity too): \ I leave the corresponding HyTime syntax as an exercise for the reader. Or, there's a Hytimegen() function for y'all to play with in DynaText. -- Steven J. DeRose Sr. System Architect Electronic Book Technologies, inc. Newsgroups: comp.text.sgml Date: 14 Oct 1994 23:25:07 UT From: Earl Hood \ Organization: Engineering, Convex Computer Corporation, Richardson, Tx USA Message-ID: <37n40j$715@imagine.convex.com> References: <37m3ns$j7c@hermes.achilles.net> Subject: Re: DTD2HTML [Erik Ableson] | I recently ran across a msg containing a reference to a utility called | DTD2HTML, designed to simplify the process of converting SGML data into | a format suitable for use with a WWW server. | | If anyone can point me to a site that has this software (or the mfg, if | it's commercial) it would be appreciated. dtd2html does not convert SGML documents into HTML. What it does do is convert a DTD into an HTML document. It allows hypertext navigation of the structure defined by the DTD, and the ability of adding element and attribute descriptions into the HTML document. dtd2html is free, and the latest information on its availability is at \. If you are designing a DTD, or you want to learn about an existing DTD, dtd2html can prove to be useful. --ewh Newsgroups: comp.text.sgml Date: 15 Oct 1994 00:30:37 UT From: Peter Flynn \ Organization: University College Cork Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ \ Subject: Re: Storage Models for SGML data [Steven J. DeRose] [Canonical references] | In DynaText(tm), use the TEIgen() property-value function. It | generates a TEI-conformant extended pointer to whatever element you | specify. 
It supports three variants of the TEI syntax: Either TPATH or | PATH from version P1 of the TEI guidelines (now replaced), or the P3 | (current) form. P3 is the default; P1 TPATH is almost exactly like | your example (as of course you know, Peter). teigen() is smart enough | to use the nearest ancestor with an ID and then work down the tree path | from there, so you get as robust a pointer as possible. So if I'm browsing a text and I find a phrase I want to cite in my thesis, I can call TEIgen to provide me with the chapter and verse? | If you're an end user you can't do a lot with the pointer other than | display it unless your publisher gives you a place to store it and use | it later. But if you have a document using extended pointers, you can | set up links in your stylesheet easily to interpret it, using our tei() | function. The temporary snag is that tei() only supports P1 syntax so | far. My guess is that in 99% of the cases, display is all that is needed: oh, maybe snip it out onto the clipboard or paste it into some edit window. P3 would be needed at some stage, of course :-) This is good news, Steve, thanks...how long has this been available in DynaBook? $64,000 question: ultimately, a lot of the texts we have will need to be available thru WWW. We (and some other people) are working on various methods of converting/exporting/subsetting TEI text into HTML (losing content markup, of course, but adding a useful access facility). To make some form of reference back to the original SGML involves links, which are not a problem to construct, but to perform useful work with them, the WWW browser must call a script which will run a search engine and return the results (reformatted into HTML on the fly). Currently I'm using PAT to do this because it works in line-mode. Do you have (would it be possible to make) a line-mode version of DynaBook that could run silently from a script to do this kind of thing? 
Have a look at http://www.ucc.ie/curia/texts/oengus.html to see what I mean. ///Peter -- Peter Flynn | pflynn@curia.ucc.ie | ...persuade users that spreading Computer Centre | +353 21 276871 x2609 | fonts across the page like peanut University College | +353 21 277194 (fax) | butter across hot toast is not the Cork, Ireland | Opinions are my own. | route to typographic excellence... Newsgroups: comp.text.sgml Date: 15 Oct 1994 00:53:30 UT From: Howard Kaikow \ Organization: MV Communications, Inc. Message-ID: \ References: \ Subject: Re: Correction re: SGML Author for Word [Christopher R. Maden] | I just spoke with Dale Waldt of \, concerning Microsoft's SGML | Author. I was incorrect in my interpretation of his article. | Regarding the debate on auto-numbering, Microsoft tells him that | anything Word can do, SGML Author for Word can do. That is, since Word | can auto-number, SGML Author can as well. In my previous comments, I assumed that. However, Word has no way to distinguish more than one type of heading, i.e., one for the normal use and another for dictionary style definitions, so a conforming SGML input file could not be properly processed by Word and Word could not output the equivalent file, if it is necessary to distinguish such tagged elements. Remember that I am not talking about hard coding things in Word. I am assuming full auto numbering and table of contents creation. Newsgroups: comp.text.sgml Date: 15 Oct 1994 06:42:10 UT From: Richard Wal Matzen \ Organization: Oklahoma State University, Computer Science Message-ID: \ Subject: Re: Anyone have any info about WORD's recent SGML release? [Mark Walter] | Sorry if I implied it exports style info. It does not. Microsoft | believes that's what RTF is for :-) [W. Eliot Kimber] | ...
In other words, the purpose of Word's SGML tools is not really to | allow you to export any arbitrary Word document in some sort of | equivalent SGML form, but to allow you to create in Word documents that | conform to a specific document type. The same is true of Word | Perfect's Intellitag. A more complete solution would be: allow you to create WORD (or WP) documents that conform to a specific document type, export them as SGML document instances, ... AND export the style information in an efficient and useful manner (associated with the DTD or document instances). [Mark Walter] | In simple terms, it maps Word document template styles to SGML elements | and attributes,... I dont understand yet exactly what this mapping implies. Does it mean that the user can map a style to an element type (or an element in context), so that when the SGML document instance is output, all text in the WORD document that is associated with (attached to) that style is output as #PCDATA within an occurrence of the element? If this approach is used, then you could export the style's formatting information manually (FOSI like, for each such element/element in context). It should also then be feasible to output it automatically, either in an external definition similar to FOSIs or with some other scheme. I understand that there is more complexity to the problem than a simple one style to one element (or to one e-i-c). However, the above approach would be useful, and I can't yet see why it could not coexist with methods for other constructs (such as structural elements that do not have any associated style). [W. Eliot Kimber] | The Microsoft people impressed me with their candor and their | understanding of SGML and the limitations of their product. They | appear to have avoided the mistakes of their predecessors and have not | tried to oversell what they're offering.... I am anxious to see their product. This level of commitment from Microsoft is a significant event. 
It is also good to hear that they are sensitive to the problem of overselling SGML tools, which has been a nagging industry problem. The formatting export issue still remains a problem. Yes, there is RTF, but although it's improving, it still lacks a good formal definition, and it will probably always be a moving target. Hopefully, Microsoft will eventually offer a better and more direct solution. Rick Matzen Newsgroups: comp.text.sgml Date: 15 Oct 1994 12:18:16 UT From: Rick Jelliffe \ Message-ID: \ Subject: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML) [Liam R. E. Quin] | Don't rule out Unicode or ISO 10646 because you don't have a 32-bit | computer! [Richard L. Goerwitz] | Again, this has nothing to do with what I was saying - although it's a | perfectly valid point to make here. If we are going to demand that all these foreigners who have mad character sets all adopt a unified character system, let's first adopt EBCDIC instead of ASCII ourselves. That way we will inflict on ourselves the same kind of needless transliterations that we ask of them: it is only fair. But it will make software so much easier to write... -ricko -- Rick Jelliffe email: ricko@allette.com.au Allette Systems phone: 2-262-4777 Sydney, Australia fax: 2-262-4774 Newsgroups: comp.text.sgml Date: 15 Oct 1994 19:18:43 UT From: Betty Harvey \ Organization: Advanced Information Systems Branch, DTMB, CDNSWC Message-ID: <37p9ujINNo53@oasys.dt.navy.mil> References: \ \ <9410122107.AA2519@notes.microstar.com> Subject: Re: Microstar [Chris Biber] | Being able to visualize the structure of your DTD is in itself of great | benefit to document authors and DTD creators. Being able to visually | _create_ the DTD brings measurable productivity increases to the | process of DTD design. | | NEAR & FAR's interaction with a central tag and structure library (CADE | Groupware) creates a complete document analysis and design environment.
| | I would appreciate any feedback from N&F users out there - Case | studies, problems encountered, suggestions are all welcome. Well since you asked |-). I think Near and Far is a great product. I haven't gotten to the point where I can 'read a DTD like a novel' yet so I need all the help I can get. I not only use it to help me navigate and understand the content models, I also use it to modify existing DTD's. It really is a wonderful tool. However, there are a few improvements I would like to see made. Here are three, off the top of my head. 1. When exporting the internal Near and Far model to an ASCII DTD allow the option of keeping each Element and its Attributes together in the file. 2. The reports function is good but is limited in its current implementation. The printing function is too slow. Allow the reports to be exported into other formats, namely ASCII. I often compare several DTD's or iterations of DTD's and use other software tools, e.g., a database or spreadsheet. I had to install an ASCII printer then print to a file (still slow). 3. Have a means to view the entire DTD hierarchy on a single screen or have a smaller window with the entire hierarchy so you can see where you are. Betty -- Betty Harvey \ | David Taylor Model Basin Advanced Information Systems Branch | Carderock Division Code 183 | Naval Surface Warfare Bethesda, Md. 20084-5000 | Center | DTMB,CD,NSWC URL: http://navysgml.dt.navy.mil/betty.html | Newsgroups: comp.text.sgml Date: 15 Oct 1994 21:04:01 UT From: Kent Summers \ Organization: Electronic Book Technologies, Inc.
Message-ID: <37pg41$g1a@spock.ebt.com> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: Re: Why SGML and NOT Acrobat [Joel Finkle] | Personally, I'd like to see a hybrid file format that can serve both | masters: | | 1) A page-turner application that can display the full appearance of | the original paper document (that regulations still require) | | and | | 2) A structured document editor that allows me to search and extract | the portions relevant to me why stop there? sun micro and ebt announced at seybold: co-development of a "universal document viewer" that can handle (browse, search, etc.) a wide variety of document formats --> SGML, HTML, dynatext e-books, answerbook, SDL (CDE's format), PDF, among others, will be supported. Newsgroups: comp.text.sgml Date: 15 Oct 1994 21:45:20 UT From: Erik Naggum \ Organization: Naggum Software, +47 2295 0313 Message-ID: <19941015T214521Z.enag@ifi.uio.no> References: \ Subject: Re: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML) [Rick Jelliffe] | If we are going to demand that all these foreigners who have mad | character sets all adopt a unified character system, lets first adopt | EBCDIC instead of ASCII ourselves. | | That way we will inflict on ourselves the same kind of needless | transliterations that we ask of them: it is only fair. But it will | make software so much easier to write... It won't be as easy to move from ASCII to UCS-2 as you appear to think. Because of hardware programming languages like C and its derivatives, it will require much arduous work before we get there. Had people used software programming languages with real and distinct `character' types, most of the stupidity could have been avoided. Character encoding is non-trivial. Internal and external representations frequently differ in the real world. Codes _always_ have context dependent semantics. 
Character strings are frequently external representations of other objects which make such strings atomic even though they consist of several individual characters. A code is a distinct concept from an integer, even though they can have the same bit representation. The same is obviously true of floating point numbers and pointers (addresses). Some of these problems manifest themselves with unprecedented force in text processing applications. Languages to be used in such environments should make an all-out effort to handle the complexities involved, and should not catch a ride with hardware programming languages and their implementations of the character concept. With a suitable level of abstraction, it no longer matters which encoding is used. The folly of insufficient abstraction from the encodings has made application programs susceptible to all kinds of easily avoidable ills. It matters little whether the encoding is 8 or 16 or 32 bits wide, or whether it is encoded as a multiple of 8-bit or 16-bit units, or with a multitude of character sets in various code spaces. Characters have meaning independent of their encoding, and as long as a program can agree with itself on an internal encoding that it can find useful, communication with the outside world can occur on any premises whatsoever. General understanding and application of these principles will have to precede deployment of character sets. I don't think the current trend in character set encoding and character representation in simple-minded programming languages will make this easier or harder for any particular character set. #\ -- Microsoft is not the answer. Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 15 Oct 1994 22:21:35 UT From: David J Birnbaum \ Organization: University of Pittsburgh Message-ID: <37pklf$5ji@usenet.srv.cis.pitt.edu> References: <3762qj$4p6@finnegan.iol.ie> Subject: Re: YASP, does anyone have a MS-DOS copy? 
[Sean Mc Grath] | Either which way, a DOS binary of YASP compiled using the existing OS/2 | hosted compilation process would be very useful. Er ... so would a native OS/2 binary. I have an OS/2 box, but I don't have the requisite compiler. Sigh. If an OS/2 binary appears, I'd be grateful for a pointer to it. Thanks, David -- Professor David J. Birnbaum djbpitt+@pitt.edu The Royal York Apartments, #802 3955 Bigelow Boulevard voice: 1-412-687-4653 Pittsburgh, PA 15213 USA fax: 1-412-624-9714 Newsgroups: comp.text.sgml Date: 16 Oct 1994 15:13:57 UT From: Gavin Nicol \ Organization: Electronic Book Technologies, Inc. Message-ID: \ References: <374l6q$fuv@usenet.rpi.edu> \ Subject: Re: Storage Models for SGML data DynaWeb(tm) will initially support 3 methods of sub-document addressing: \ \ \ where "nnnnn" represent an element ID within DynaText. For those who don't know, DynaWeb(tm) allows people to publish DynaText(tm) books on the Internet with little additional effort. DynaWeb(tm) appears as an HTTP compatible server to WWW users, with some extensions to the normal server functionality. Newsgroups: comp.text.sgml Date: 16 Oct 1994 23:25:55 UT From: Erik Naggum \ Organization: Naggum Software, +47 2295 0313 Message-ID: <19941016T232555Z.enag@naggum.no> References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com> Subject: Re: Why SGML and NOT Acrobat [Joel Finkle] | Displaying an SGML doc on the fly is slow, especially when you want to | see something in the middle. Having just a page-turner limits the | content-based retrieval. A cross-breed is what I'd like to see. I do not write page-turners or formatting applications. However, I know about programming and about smarter ways to do things than linear search, which many non-programmers seem to assume for SGML. 
If I were to write a page-turner or formatting application, I would read the entire SGML document once and build an internal representation that is better suited to my needs than a linear representation. (Lisp interpreters do not process long lists of nested parenthetical expressions, they process an internal representation built by the lisp reader function from the external, linear representation -- similarly for SGML.) One easy way to accomplish this internal representation is to use "markers" into the external (linear) representation and then maintain the necessary environmental information in the internal representation. Given this, it should be relatively comfortable to move to a particular point in the SGML document and read data from there, and one would know enough about the history of the parse at this point to do intelligent things quickly. This is not trivial business, but PostScript is not trivial, either. The key is to use languages that enable such processes comfortably. (Comfort and simplicity are not necessarily related.) We observe that PostScript is not comfortable enough for the page-turners, so they created PDF. A not dissimilar effort was started years ago with the SGML-B project, to obtain a binary representation for SGML. For single-purpose languages, it makes sense to standardize an external representation of the processing state just prior to the `showpage', making the `showpage' very quick later. For multi-purpose languages such as SGML, it makes much less sense, primarily because there are no clear points in the processing when it is obviously useful to save the system state. Therefore, SGML systems must of necessity invent their own internal representations and their own optimized external representations of this information. That they have to invent their own format is not an argument for not doing it and instead using linear search through a long character stream to find things. #\ -- Microsoft is not the answer. 
Microsoft is the question. NO is the answer. Newsgroups: comp.text.sgml Date: 17 Oct 1994 01:34:49 UT From: Peter Kerr \ Organization: School of Music University of Auckland Message-ID: \ References: \ <1994Sep30.024603.29919@midway.uchicago.edu> <1994Oct13.184655.26716@sq.sq.com> <1994Oct14.051527.25155@midway.uchicago.edu> Subject: Re: Multilingual HTML, SGML documents? [Richard L. Goerwitz] | Okay, now try fully cantillated Hebrew, or Quranic Arabic, or (better | yet) Akkadian cuneiform. Or just try to run an import / export / shipping / publishing / etc business in the few square miles between Portugal - Hungary - Norway. ISO 10646 is at least an attempt to standardise glyph mapping. Now a *language attribute* might permit context sensitive intelligent glyph mapping, and that would be very useful too. But what would keep the customers happy in the meantime is that their existing document writing, reading and transport software should be able to use a standard glyph map (any standard!) and keep it intact, from the finger-keyboard interface thru to the monitor-eyeball interface. Of course I realise that this problem is not confined to HTML / SGML. -- Peter Kerr bodger School of Music chandler University of Auckland neo-Luddite Newsgroups: comp.text.sgml Date: 17 Oct 1994 03:10:07 UT From: Mike Austin \ Message-ID: \ Subject: Variables in HTML documents? I've got down general HTML and forms, but it _looks_ like there is a way to define variables in a document. All I want to do is define an ftp directory that I use over and over. Thanks! -- cout << "cman@netcom.com" << endl; << I get more junk e-mail than junk mail! >> Newsgroups: comp.text.sgml Date: 17 Oct 1994 06:20:38 UT From: Sami-Jaakko Tikka \ Organization: Helsinki University of Technology, CS lab Message-ID: <37t53mINN1pb@tahma.cs.hut.fi> Keywords: sgml, ventura, zander tagwrite, html, dtd Subject: Info about Ventura 5.0 and Zander Tagwrite? Hello!
First you should know that I am a novice about all things concerning SGML. I am supposed to begin converting a newsletter to HTML format. The newsletter is written with Ventura 5.0 and together with the fact that Ventura comes equipped with an SGML program called Zander Tagwrite leads me to believe that the conversion could be possible. I am thinking that perhaps it is possible to feed the Ventura document together with HTML DTD to the Zander Tagwrite and have it produce HTML for me. Now does anyone know the addresses or other contact info for the authors of Ventura or Zander Tagwrite? I'd like to ask them if the procedure I've described above is possible. Or is anyone of you able to comment on that? Additionally, does anyone know of similar SGML converters for Unix, Windows and Mac platforms? I mean like converters that can read some word processor format and a DTD and produce another format from that. The DTD I'm thinking about would be that of HTML. Finally, am I correct in supposing that DTD is a well defined format and that any converter program claiming to do SGML should be able to read a text file describing the DTD and produce something reasonable? Thank you in advance. -- \Sami Tikka\ "Peace and Long Life." "Live Long and Prosper." Newsgroups: comp.text.frame,comp.text.sgml Date: 17 Oct 1994 09:51:40 UT From: Steve Holden \ Organization: Scotia Electronic Publishing Message-ID: <782387500snz@scotia.co.uk> References: <37ljg6$26l@elna.ethz.ch> Subject: Re: SGML-related Frame Products: no university licensing?! [Hans Ulrich Wiedmer] | I am told by a local Frame distributor that there is no university | licensing available. This holds for every Frame Product related to | SGML (i.e., FrameBuilder, SGML Toolkit, SGML Utilities etc.). It | sounds like this is just the policy of Frame Technology Inc. | | DO OTHER UNIVERSITIES EXPERIENCE THE SAME SITUATION ?????? | | - if yes: is there an understandable reason for Frame Technology to do | so? 
Well obviously I can't speak for Frame. I ran the first reseller in the UK for about five years. During those five years I spent quite some time encouraging Frame to compete in the UK academic market against a very effective InterGrief promotion where licences were available at nominal cost. | - if no: could anybody give me an address of a Frame distributor in | the "neighbourhood" (whatever that may mean, let's say Europe). Frame steadfastly ignored such advice [their privilege] for about four years. Just after I spent three weeks in intensive negotiations with a major UK university they announced [without having told their resellers first] that a deal had been arranged with a UK body called CHEST, where licenses would be available for a small fraction of the standard price. But resellers were cut out of this deal due to low margins! I understand that dealers in the UK can now sell to academic sites at the special pricing. My advice would be to get your resellers to apply pressure to Frame's European distribution staff based in Dublin, and threaten to go to a UK reseller. Experience shows that Frame's attitude to pan-European dealing was, shall we say, eccentric -- and may not comply with various EU directives! CAVEAT: I am no longer involved in reselling FrameMaker or its derivatives. -- Steve Holden Scotia Electronic Publishing +---------------------------+ Tel: +44 436 678962 29 John Street | Tools, Training and | Fax: +44 436 677814 (bureau) Helensburgh | Technology for Successful | Internet: steveh@scotia.co.uk Scotland | Electronic Publishing | Compuserve: 100343,2205 G84 8XL +---------------------------+ Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:09:44 UT From: Markus Kuhn \ Organization: Student Pool, CSD, University of Erlangen, Germany Message-ID: <37tih8Eemq@uni-erlangen.de> References: \ <19941015T214521Z.enag@ifi.uio.no> Subject: "was: Abandon ASCII for EBCDIC now!"
\ [Erik Naggum] | It won't be as easy to move from ASCII to UCS-2 as you appear to think. | | Because of hardware programming languages like C and its derivatives, | it will require much arduous work before we get there. Had people used | software programming languages with real and distinct `character' | types, most of the stupidity could have been avoided. However, there is a wonderful practical compromise between the ASCII compatibility some systems (especially C/UNIX) require and the need for a 16- or even 31-bit character set: the UTF-8 encoding of ISO 10646. It is simple, easy to remember, ASCII compatible and has a lot of nice properties. For those interested, please have a look at ftp.uni-erlangen.de pub/doc/ISO/charsets/utf-8.c (contains all the documentation and some simple demo code). As SGML is a _very_ text file (ASCII) oriented system (well, in pure theory it is not, but the whole design of SGML is focused on allowing editors like vi and emacs to be used for SGML file handling), using UTF-8 instead of UCS-2 would be a consistent continuation of this design goal. Markus -- Markus Kuhn, Computer Science student -- University of Erlangen, Internet Mail: \ - Germany WWW Home: \ Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:38:03 UT From: Piero Mancino \ Organization: Ericsson Telecom Message-ID: \ References: <9410140838.AA04551@lars.texcel.no> Subject: Re: SGML and the EEC There was an ODA conference last June in Brussels which two people from the EC attended. They were directly involved with the use of ODA in the EC. Here are their addresses; if you need more information you can e-mail me. I hope this will help, /Piero Ken Thompson (Legislation and Standardization and Telematics Network) Tel. + 32 2 296 89 92 Fax + 32 2 296 95 00 X.400 C=BE; A=RTT; P=CEC; O=DG3; S=THOMPSON; G=KEN Jean Piette (Informatics Directorate-Rules and Coherence for Informatics) Tel.
+ 352 4301 33677 Fax + 352 4301 33869 X.400 C=be; A=rtt; P=cec;OU=di;S=piette Newsgroups: comp.text.sgml Date: 17 Oct 1994 10:53:05 UT From: Peter Witt \ Organization: Ericsson Telecom Message-ID: <37tl2h$jvt@erinews.ericsson.se> References: <36p76b$c1@erinews.ericsson.se> Subject: Re: SGML to Postscript formatter I have been off the net for a week and I see now that I have had a great response on the high speed formatting issue for SGML to PostScript. I have included the articles mailed on this subject with my comments/more information below. [Bob Agnew] | There are two prevalent standards for the formatting of SGML tagged | data, FOSIs and DSSSL. FOSIs are governed by the outspec DTD in | Mil-Std-28001B Appendix B. FOSIs are supported by Arbortext and | DataLogic. DSSSL? [Peter Flynn] | What about SGML-->TeX-->PostScript? I wrote a pilot SGML2TeX which you | can pick up by ftp in pub/sgml on curia.ucc.ie. At least using TeX | overcomes any portability problems. Peter, I will get the program to test it here, but first I have to find out how I can fetch files with mail ftp. I have not tried it before so this will be my first attempt. I have no direct access to Internet from where I work, but our company has a central node on which I need to have an account. What more do I need after I get a TeX output from the SGML2TeX? A TeX to PostScript formatter? How do you specify the formatting data for SGML2TeX? FOSI? [Warren Baird] | The system we're using here consists of a shell script that calls sgmls | and sgmlsasp, and optionally calls a perl script with the output of | sgmlsasp to polish the output. | | We either go directly from SGML to HTML for online viewing, or go from | SGML to LaTeX, and then from LaTeX to PostScript for printing. | | Going directly from SGML to Postscript is feasible, but probably not | practical. Why not? OK ;-), we don't have to go all the way to PostScript, but the other machines we are using expect PostScript.
For printing PS on our high volume production XEROX printers we use Fibre 600 software from ENTIRE to create IMG formatted documents that can be printed. | With our system, going from SGML to HTML or LaTeX is quite fast, but | going from LaTeX to PS is as slow (or as fast) as running LaTeX | normally is. [Peter Bomberg] | since you did not mention the products that you have looked at I am not | sure if I'm just giving you old information. | | I have looked closely at using one of four products. CAPS (xerox), | Pager (datalogics), I6sgml (interleaf) or arbortext publisher | (arbortext). They all have pro's and con's but if you wan't any | specific info then just ask :} | | ps. What products have you looked at? (and what did I miss in my | reaserch) ;-) I have looked at the following products:

Product and supplier:                  Formatting data:
---------------------                  ----------------
SGML Translator from IBM               proprietary
Parlance Publisher from XY Systems     proprietary
DL Composer from Datalogics            FOSI
CAPS from XSoft (Xerox)                proprietary (FOSI will soon be implemented, they say)
ArborText ADEPT series from ArborText  FOSI

As you see, no one is talking about DSSSL, and I haven't heard from my contacts at the above suppliers that DSSSL will be considered either. [John Vernon] | Can you give me any contact details for any of these two companies? I | am looking for an SGML to Postscript converter. | | The main concern is ease of use, (e.g., FOSI defined style info is | fime, but something simpler in addition would help), performance and | availability across platforms. | | Can you point advise me where I might find such a converter. Newsgroups: comp.text.sgml Date: 17 Oct 1994 12:22:31 UT From: James Clark \ Organization: None, London, England Message-ID: \ Subject: a new SGML parser \ available for testing I have been writing a new SGML parser, tentatively called SP, over the last couple of years.
My main goal was to provide a solid base for the SGML applications that I wanted to write in the future. But I think it now has sufficient functionality that it may be useful to others. So I've decided to make it available for alpha-testing. The code is copyrighted under the same terms as X11R6; these allow commercial use.

Here are some of the features of SP:

- Written from scratch in C++.

- Supports any concrete syntax allowed by the standard.

- Supports all varieties of LINK (but the length of chains of LINK
  processes is limited to 1.)

- Reentrant: a single process can use multiple parsers at the same time.

- Includes application (nsgmls) that generates sgmls output format and
  is command-line compatible with sgmls.

- Includes application to generate RAST output format. (Converting
  sgmls output to RAST format isn't sufficient for testing LINK.)

- Supports large character sets. It can be compiled to use 16-bit
  characters internally (32-bits is also a possibility for the future).
  This allows the entity manager to deal with all the messy details of
  multi-byte encodings. System identifiers can specify which coding
  scheme a file uses. Supported coding systems include UTF-8 and
  Unicode/UCS-2 (intended for use with ISO 10646) and UJIS/EUC and
  Shift-JIS (for Japanese character sets).

- Fast for large documents. For example, on my machine (a sparc 10), it
  processes the 7Mb Exoterica test document (lcrw.sgm) roughly twice as
  fast as sgmls. On the other hand, it's slower than sgmls at parsing
  prologs. It also uses much more memory than sgmls. I haven't looked at
  its performance on other machines, and it may well be worse.

It doesn't support any of the following:

- CONCUR (it will parse documents that use CONCUR but you can't make
  document types active).

- DATATAG (support for arbitrary short references makes DATATAG of
  limited usefulness in my view.)

- CAPACITY checking.

I would emphasize that this is an alpha release.
This means:

- It hasn't been tested very much (although I have run it on one large
  SGML test suite).

- There isn't much documentation. In particular, there's no
  documentation on the parser interface. On the other hand, by reading
  the header files and looking at the included applications, you should
  be able to figure out how to use it.

- I haven't done much release engineering. There's no fancy configure
  script. You may well encounter difficulties building it if your system
  differs from mine.

- The parser is still under active development. All of the interfaces
  are subject to change.

My development platform is a Sparc 10/Solaris 2.3 with gcc 2.6.0. Earlier versions of gcc will not work. An iostreams library is needed (such as is included with libg++). You should have little difficulty building on SunOS 4.1, since I only recently moved from that to Solaris. I have also compiled it with Sun C++ 4.0, but I wouldn't recommend using this. The code is in theory quite portable, but it uses templates very extensively, and the mechanisms for instantiating templates vary greatly between compilers, and furthermore many template implementations are rather buggy. So I would recommend using gcc if you can.

The source code uses long filenames, so you will not be able to build it on MS-DOS/Windows. (If you want to attempt to build an MS-DOS/Windows version I suggest you try building a Win32s version on Windows NT.)

You can get it by anonymous ftp from ftp.jclark.com:/pub/sp/sp.tar.gz. Binaries for a sparc running solaris 2.3 are available in the /pub/sp/sparc-sun-solaris2.3 directory. Man pages are included in the main distribution but are also available separately in the pub/sp/man directory for the benefit of those using the binaries. I would be happy to provide space for binaries for other architectures.

I would very much like to hear about any bugs you find. I am also interested in hearing about any places where the code does not conform to the current ANSI C++ working paper.
At this point, I'm less interested in hearing about bugs in compilers that prevent them from compiling SP.

Although the code is all new, the parser uses many ideas from Charles Goldfarb's ARCSGML parser and the entity manager uses a number of ideas from Erik Naggum's POEM. I thank them both.

James Clark
jjc@jclark.com

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 13:07:26 UT
From: Jacques Deseyne
Organization: EuroKom Conferencing Service
Message-ID: <1994Oct17.150726.14369@eurokom.ie>
Subject: Re: SGML and the EEC

I don't think that the EEC is dropping SGML in favor of ODA. Indeed, a few projects are set up to test ODA for communication of documents, but this doesn't mean that investments in SGML will be given up. On the contrary.

The IDA program (Interchange of Documents between Administrations) has one project where conversion of documents in ODIF (ODA Interchange Format) will be tested out. This is for really _BLIND_ exchange without agreements on structure, which is what ODA was invented for. Information on this project should be available from Mr. Guy Rossignol, European Commission, DG III/B/5, Office R3 7/21, 200, Rue de la Loi, B-1049 Brussels, Fax. +32 2 299 02 86.

The IDA program is also involved in using SGML! A whole bunch of SGML applications are set up at the EEC and will be extended to other environments of users. New applications will be set up. As an example, the Publications Office is quite involved in using SGML for paper-based and electronic publishing. I don't think ODA fits their needs. The best information point is probably the Open Information Interchange initiative, part of the Impact program. I can't find the contact details right now, but it is in Luxembourg.

A few initiatives from the EEC have invested in the development of the ODA standard and continue to promote its use for specific purposes. From the implementation point of view, the number of SGML-based installations far exceeds the installations using ODA.
We were involved in testing out both SGML and ODA for conversion between different word processing formats at the EEC, in the context of the "EuroLook" documents. EuroLook is a set of structured documents which can be created in customized word processor environments. It appeared from these experiments that current ODA converters lose all structure information (i.e., none takes the Generic Logical Structure, which is more or less the ODA equivalent for a DTD, into account). Only the converter from Bull (RTF to ODIF) performed somewhat better. For the SGML conversions, it appeared that you need high-quality conversion environments (such as OmniMark, FastTag, ...) for good results. Nothing in my contacts with a lot of people at the Commission showed anything which would lead to conclude that the use of SGML would be dropped in favor of ODA. The nature of this rumor seems to be inspired by the age-old hostility which appears to exist between SGML and ODA adepts. Little can be done to appease the animosity, but every side should be aware that both standards were conceived for really different purposes. SGML is probably well known to the readers of this news group; ODA was conceived for the blind exchange of documents in an OSI environment (its acronym meant originally "Office Document Architecture", later changed to "Open Document Architecture"). I would be happy to know the source of these rumors! 
--
Jacques Deseyne
jdeseyne@eurokom.ie
Avenue Louise 250, box 14
B-1050 Brussels

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 14:17:55 UT
From: Paul Hermans <71140.2011@compuserve.com>
Message-ID: <941017141755_71140.2011_HHE36-1@CompuServe.COM>
Subject: SGML BeLux -- new address

Pro Text, chair of the Belgian-Luxembourgian chapter of the International SGML Users' Group, SGML BeLux vzw, has moved to:

Interleuvenlaan 62
3001 Heverlee
+32/16/40 66 81 (voice)
+32/16/40 66 91 or 40 01 35 (fax)

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 15:05:18 UT
From: Christophe Espert
Organization: A poorly-installed InterNetNews site
Message-ID: <37u3re$72d@edf3.der.edf.fr>
Subject: YASP compilation on Unix platforms

Regarding YASP's compilation on Unix platforms: if you encounter problems with the GNU GCC compiler, read this.

You may encounter problems compiling the server.c file needed for the SMP00 and RAST applications. You should use g++ with the -x c option. This should ensure that the libraries are linked correctly.

Moreover, you may have problems at link time with the fgetpos and fsetpos functions. You can replace them with ftell and fseek respectively. Scan through the server.c file and find the lines where fgetpos and fsetpos are used, then do this:

Replace

    if ((rc = fgetpos(f_p->fd, &posblk_p->fpos)) != 0) {

with

    if ((posblk_p->fpos = ftell(f_p->fd)) < 0) {
        rc = 0;

and replace

    if ((rc = fsetpos(f_p->fd, &(sra_p->posblk.fpos))) != 0) {

with

    if ((rc = fseek(f_p->fd, sra_p->posblk.fpos, SEEK_SET))) {

Finally, add the following statements right after the #if (defined(M_I86)) ... #endif block at the top of the file:

    #if (defined(GNUCC))
    #define FILENAME_MAX 512
    #define fpos_t long
    #define SEEK_SET 0
    #endif

I hope it is all clear :-) Good luck.
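As a side note on the substitution above, here is a minimal sketch, not taken from the YASP sources, of how the standard calls correspond in C: ftell reports the current offset, as fgetpos does, and fseek with SEEK_SET returns to a saved offset, as fsetpos does. The helper names save_pos and restore_pos are hypothetical, and the sketch assumes that a long offset is large enough for the files involved, which is the same assumption the fpos_t define makes.

```c
#include <stdio.h>

/* Hypothetical stand-in for fgetpos(): store the current offset in *pos.
   Returns 0 on success and nonzero on failure. */
int save_pos(FILE *fp, long *pos)
{
    long off = ftell(fp);   /* ftell returns -1L on error */
    if (off < 0)
        return -1;
    *pos = off;
    return 0;
}

/* Hypothetical stand-in for fsetpos(): go back to an offset that was
   recorded earlier with save_pos().  Returns 0 on success. */
int restore_pos(FILE *fp, const long *pos)
{
    /* fseek returns nonzero on failure */
    return fseek(fp, *pos, SEEK_SET) != 0 ? -1 : 0;
}
```

Both helpers return 0 on success and nonzero on failure, so they can sit behind the same "if (rc != 0)" style of test as the calls they stand in for. On text streams, an offset from ftell should only be handed back to fseek unchanged, which is all restore_pos does.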
Best regards,
--
Christophe Espert - E-mail: espert@cln46fw.der.edf.fr
EDF - DER                        | High Text
1, Avenue du General de Gaulle   | 5, rue d'Alsace
92141 Clamart CEDEX - FRANCE     | 75010 Paris - FRANCE
Tel: 33.1.47.65.43.21 ext. 6635  | Tel: 33.1.42.05.93.15
Fax: 33.1.47.65.50.07            | Fax: 33.1.42.05.92.48
ISO 8879:1986 - SGML | ISO/IEC 10744:1992 - HyTime | ISO/DIS 10179 DSSSL

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 15:22:06 UT
From: "Wayne L. Wohler"
Organization: IBM Information Development Strategy and Tools
Message-ID: <37u4quINNi50@afshub.boulder.ibm.com>
References: <37htslINNogp@afshub.boulder.ibm.com> <19941013T181021Z.enag@ifi.uio.no>
Subject: Re: function character parsing

[Erik Naggum]

The MSxCHAR function characters are, as you say, not markup according to the definition in 4.183, so one cannot argue they should not be recognized under the provisions you cite. While not a conformance argument, these function characters are not useful when they do not work within PIs and comments, since authors will certainly want to enter comments in their own language and may need to enter characters from their alphabet in PIs.

| 9.7 Markup Suppression
|
| An /MSOCHAR/ suppresses recognition of markup until an /MSICHAR/ or
| entity end occurs. An /MSSCHAR/ does so for the next character in the
| same entity (if any).
|
| NOTE -- An /MSOCHAR/ occurring in |character data| or other delimited
| text will therefore suppress recognition of the closing delimiter. An
| /MSSCHAR/ could do so if it preceded the delimiter.

I would feel better about this citation if there were a definition of delimited text in the standard, which there is not. Given the definition of delimiter characters in 4.86, I think it is fair to say comments and PIs fall into the classification of 'delimited text'.

Thanks for the post, Erik. I have received one note offline which agrees that the interpretation you expressed is correct.

-- Wayne L.
Wohler Internet: wohler@vnet.ibm.com
Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL
Information Development Strategy and Tools Phone: +1 303 924 5943
IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 16:15:11 UT
From: "Wayne L. Wohler"
Organization: IBM Information Development Strategy and Tools
Message-ID: <37u7ufINNi50@afshub.boulder.ibm.com>
Subject: Re: Abandon ASCII for EBCDIC now! (Was Re: multilingual HTML, SGML)

[Rick Jelliffe]
| If we are going to demand that all these foreigners who have mad
| character sets all adopt a unified character system, let's first adopt
| EBCDIC instead of ASCII ourselves.

Sounds like a good idea to me. :-) (Check the signature line if you don't get the joke.)

Seriously, this isn't about inflicting needless pain; it's about finding a way out of the morass of character encodings we have today, in both EBCDIC and ASCII, both of which are largely meaningless terms unless you use only lower case, upper case and a few punctuation characters. However we address this problem (and it is a problem that is growing as the world-wide networks are used more), it will take a LONG TIME to move to the solution. We didn't build the current tottering edifice in a day.

Prudent application developers will begin to develop their code independent of the encoding of the data it processes, if they aren't already. This will make it easier to support the function concurrently on today's systems and on systems developed in the future which use more modern character encodings. That way, you can take your pick: EBCDIC, ASCII, ISO 10646 or roll-your-own. SGML systems have a very natural place to put this function; it's called the entity manager.

-- Wayne L.
Wohler Internet: wohler@vnet.ibm.com
Dept G82/025Z IBMMAIL: USIB29WX@IBMMAIL
Information Development Strategy and Tools Phone: +1 303 924 5943
IBM Corporation PO Box 1900 Boulder, Colorado 80301-9191

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 16:23:53 UT
From: "Michael A. Nelson"
Organization: MSFC
Subject: SGML Fan want formatting info

I've been watching the progress of SGML for several years now, and I continue to be surprised by the diversity of its application. I work in Management Information Systems here at Marshall Space Flight Center (MSFC) for NASA, and we are involved in a lot of configuration management of information (mainly documents). Our procedures for managing the data products are heavily rooted in the ultimate paper format of the data.

My question is: is there a standard approach to formatting SGML tagged information such that I can always guarantee that everyone working with it sees the same thing? I understand there is something called a Formatting Output Specification Instance (FOSI). Are FOSIs application-unique, or is there some kind of standard, either formal or de facto, that all FOSIs are based upon? With the limited time that I've had to research this, I haven't found one.

Please post your responses here in this newsgroup.

Thanks
--
# Michael A. Nelson (MAN) ... on the moon.
#
# "Everything'll be all right" said the elephant as she danced
# among the chickens.

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 18:32:34 UT
From: "Steven J. DeRose"
Organization: EBT, Inc.
References: <53@direct.iaf.nl> <1994Oct14.184628.21488@tin.monsanto.com>
Subject: Re: Why SGML and NOT Acrobat

[Joen Finkle]
| Personally, I'd like to see a hybrid file format that can serve both
| masters:
|
| 1) A page-turner application that can display the full appearance of
| the original paper document (that regulations still require)
|
| and
|
| 2) A structured document editor that allows me to search and extract
| the portions relevant to me

I too would like to have the Holy Grail. But this combination is hard. No one has yet proposed a workable approach. I would say that if you have to choose, you're far better off choosing the high ground and attempting to force-fit formatting into an SGML structure than trying to fit the richer descriptive markup structure of SGML into a format file.

One key problem is that a page-turner has to support whatever flashy new layout someone dreams up -- flip through the glossiest magazine you have and look at the layout of the pages. If you want your point (1) you have to support that, and you'd better update your format every time PageMaker(tm) adds a new capability.

But the biggest problem is that the relationship between (1) and (2) is incredibly complex. It's not just plopping in comments that contain SGML tags (though that is often suggested). Among the simpler problems:

a) How do you express the relationship between a footnote (better yet, a page-spanning footnote) and its reference point?

b) How do you represent the connection of a story fragment that got moved miles away because of newspaper page constraints ("see 'foo' on column 3, p. 2b")?

c) Speaking of newspapers, why would I *want* to have an 18 x 24 inch page image on my 8 x 10 inch screen? Do I clip everything at the screen edges, or do I shrink the 7-point type down to 3-point and hope I can read it on a screen with 80dpi resolution?
d) What do you do about text and other objects that were generated by the formatting process, or worse yet, that were deleted by it?
   i) page numbers, headers/footers, automatic list numbers, etc.
   ii) the 'novice' info when the previous reader was expert, and vice versa.

e) What do you do when the font you wanted isn't there? You can either try modifying an existing font to sort of fit, or give up. And if there's a character set difference, you're completely out of luck (which is why page-flipper products store a full copy of any non-Latin1 font in every document file; note that all Mac fonts are non-Latin1!).

If you really try to work an example of a substantial document, you'll discover these and scores more problems that must be solved. Pull down a PostScript file from the net and print it out as a PostScript file (not as formatted pages) -- last time I did this I discovered there was not one intact word of the printed text there; it was all programs to place single letters just so. My sympathies in advance to anyone who tries to combine that with hierarchical descriptive markup structures.

| Displaying an SGML doc on the fly is slow, especially when you want to
| see something in the middle. Having just a page-turner limits the
| content-based retrieval. A cross-breed is what I'd like to see.

This, I'm happy to say, need not be correct. Try DynaText(tm). Try it on a 100-MB document. Open it and drag the scrollbar to the middle or end. In my SGML course at HyperText '93 I showed the same document in a whole range of formats from bitmap through RTF, troff, and on to SGML. The closer you get to SGML, the fewer bytes you need (even allowing for compression), and, often, the less processing you need for typical applications. That makes for speed.

Now, given all that, I will say there is one method that has been shown to work reasonably well: store two completely separate representations, and make them point at each other.
For example, plop markers into the SGML that show where each formatted page starts, and keep the formatted pages around as bitmaps or something. While this does *not* solve the problems I noted, it at least gives a simple, clean representation for the *uncomplicated* cases.

The TEI has worked hard on such problems, perhaps because it encompasses experts in both ends of the process: paleographers who care about the exact slant angle of the third "t" in each line of manuscript, and linguists who want to do vocabulary statistics on abstracts versus full-text, as well as a broad spectrum of scholars, implementors, and others between. As is becoming well known, excellent thought and practical solutions to serious markup problems are to be found in the TEI guidelines.

--
Steven J. DeRose
Sr. System Architect
Electronic Book Technologies, Inc.

Newsgroups: comp.text.sgml
Date: 17 Oct 1994 19:33:39 UT
From: "Steven J. DeRose"
Organization: EBT, Inc.
References: <374l6q$fuv@usenet.rpi.edu>
Subject: Re: Storage Models for SGML data

[Peter Flynn]
| So if I'm browsing a text and I find a phrase I want to cite in my
| thesis, I can call TEIgen to provide me with the chapter and verse?

In effect; what it will actually give you is a P1 extended locator that starts with the nearest ancestor that has an ID, and then walks down by child numbers to the node you selected. For those who know HyTime, this amounts to a location chain consisting of a nameloc and treeloc (except that you have the option of counting children by GI if you want, so that SECTION 2 is #2, not #3 because of the chapter title, etc.).

If you want an ugly but trivial way to view these values, copy your fulltext stylesheet and modify one or more style definitions to put the locator value in before the content, maybe in brackets like here: \