HECnet May 2013

hecnet@lists.dfupdate.se

38 participants
680 discussions

by mcguire＠neurotica.com

I like it.

-Dave

--
Dave McGuire, AK4HZ
New Kensington, PA

On May 19, 2013, at 11:03 AM, Brian Hechinger <wonko at 4amlunch.net> wrote:

> CSV as a data format is actually quite powerful, flexible and simple. In order for it to work, however, we need to agree on some ground rules.
>
> The ground rules are simply this:
>
> 1) The first line must *always* be the header. This is not optional.
> 2) The field separator must be chosen.
> 3) The names of the columns must be decided on.
> 4) The column names should be case insensitive.
> 5) The columns need to be defined as required or optional.
> 6) This all needs to be documented somewhere.
>
> The reason for the header being required is you can do column name to column position mapping. This makes the CSV column layout not fixed. The columns can be in whichever order each person finds appropriate.
>
> #2 is pretty much already chosen as | seems to be the most common one (one of my favorites as it rarely appears in text).
>
> There are some already commonly used column names at this point, so we should make a list of what those are and which we are going to use.
>
> I think the rest are self explanatory. :)
>
> Let's discuss!!
>
> -brian

12 years, 10 months

CSV Proposal

by ian＠platinum.net

On 2013-05-19, at 9:34 AM, Johnny Billquist <bqt at softjar.se> wrote:

> I honestly don't know what I think on that subject. Or, actually, I do. I think it would be even better to just have key-value pairs. Why overcomplicate things? Any unknown key is just ignored. We could expand with new keys without having the code synced. And then the values can hold any character except newline.

You mean just like a Windows .ini file? (ducks)

j/k

Ian

12 years, 10 months

CSV Proposal

by robert.jarratt＠ntlworld.com

-----Original Message-----
From: owner-hecnet at Update.UU.SE [mailto:owner-hecnet at Update.UU.SE] On Behalf Of Johnny Billquist
Sent: 19 May 2013 17:34
To: hecnet at Update.UU.SE
Cc: Brian Hechinger
Subject: Re: [HECnet] CSV Proposal

> On 2013-05-19 18:14, Brian Hechinger wrote:
>> On Sun, May 19, 2013 at 05:02:18PM +0100, Robert Jarratt wrote:
>>> Well, if the separator is | then it isn't a CSV file :-)
>>
>> While I do understand the origins of the term CSV, the C really does stand for character. Tabs are used quite commonly as well as commas. Any character is valid, so long as it's accepted by all consumers of said file. The C standing for comma is antiquated and outdated and should be changed. In my not so humble opinion, anyway. :)
>
> He says, to a crowd of people running antiquated and outdated software and hardware... :-D
>
> Anyway, while I think I agree that technically, C stands for comma, I also think | is a better separator in this case. Feel free to call it anything. The acronym for the format is less important. We could call it BSV then.

I didn't want to start a big debate about this, it was just a little joke. I would be more than happy to use the pipe symbol as the separator.

>>> I have no issue with this except that allowing the columns to appear in any order, while nice and flexible, makes it harder to write the software. It does not seem worth the effort to have that flexibility given the low enthusiasm for writing this software, so a simplification would be to fix the column order.
>>
>> It really isn't that hard to write. I've implemented such a thing in many languages (including very stupid ones) and it is *always* worth the effort.

It wouldn't be a big effort for me on the platforms I write for today. It would be harder for me on platforms I programmed a long time ago (or even never in the case of RSX) and never having programmed for Datatrieve, but that is precisely why I would like to do it, although I am not at all sure about the RSX hurdle. Is there a language on VAX that would make it easier to port it to RSX? Or can Datatrieve be accessed over DECnet so the scraper code runs on VAX and populates Datatrieve on RSX?

> I honestly don't know what I think on that subject. Or, actually, I do. I think it would be even better to just have key-value pairs. Why overcomplicate things? Any unknown key is just ignored. We could expand with new keys without having the code synced. And then the values can hold any character except newline.
>
>>> Actually I am thinking that I wouldn't mind writing the program myself, but don't wait for me, I have a host of other things I want to do as well.
>>
>> I'm always open to writing anything. :)
>
> Knock one out for RSX? :-)
>
> I'll try and get back with some more thoughts about file formats and whatnot in a while. My pizza just arrived...
>
> Johnny
>
> --
> Johnny Billquist                  || "I'm on a bus
>                                   ||  on a psychedelic trip
> email: bqt at softjar.se          ||  Reading murder books
> pdp is alive!                     ||  tryin' to stay hip" - B. Idol

12 years, 10 months

CSV Proposal

by bqt＠softjar.se

On 2013-05-19 18:14, Brian Hechinger wrote:
> On Sun, May 19, 2013 at 05:02:18PM +0100, Robert Jarratt wrote:
>> Well, if the separator is | then it isn't a CSV file :-)
>
> While I do understand the origins of the term CSV, the C really does stand for character. Tabs are used quite commonly as well as commas. Any character is valid, so long as it's accepted by all consumers of said file. The C standing for comma is antiquated and outdated and should be changed. In my not so humble opinion, anyway. :)

He says, to a crowd of people running antiquated and outdated software and hardware... :-D

Anyway, while I think I agree that technically, C stands for comma, I also think | is a better separator in this case. Feel free to call it anything. The acronym for the format is less important. We could call it BSV then.

>> I have no issue with this except that allowing the columns to appear in any order, while nice and flexible, makes it harder to write the software. It does not seem worth the effort to have that flexibility given the low enthusiasm for writing this software, so a simplification would be to fix the column order.
>
> It really isn't that hard to write. I've implemented such a thing in many languages (including very stupid ones) and it is *always* worth the effort.

I honestly don't know what I think on that subject. Or, actually, I do. I think it would be even better to just have key-value pairs. Why overcomplicate things? Any unknown key is just ignored. We could expand with new keys without having the code synced. And then the values can hold any character except newline.

>> Actually I am thinking that I wouldn't mind writing the program myself, but don't wait for me, I have a host of other things I want to do as well.
>
> I'm always open to writing anything. :)

Knock one out for RSX? :-)

I'll try and get back with some more thoughts about file formats and whatnot in a while. My pizza just arrived...

Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt at softjar.se          ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol
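[Editor's sketch] The key-value idea in the message above can be illustrated in a few lines of Python. This is only a sketch under assumptions the thread does not settle: the `=` separator, the one-pair-per-line layout, the case-insensitive key matching, and the key names `node`, `owner` and `location` are all hypothetical. It shows the two properties Johnny asks for: unknown keys are silently ignored, and a value may contain any character except newline.

```python
# Hypothetical set of keys a consumer understands; the thread names none.
KNOWN_KEYS = {"node", "owner", "location"}


def parse_key_values(text, known_keys=KNOWN_KEYS):
    """Parse 'key=value' lines into a dict, ignoring unknown keys.

    The '=' separator is an assumption. Only the first '=' on a line
    splits key from value, so the value may itself contain '=', '|',
    or any other character except newline.
    """
    record = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        if not sep:
            continue  # no separator on this line, so it is not a pair
        key = key.strip().lower()
        if key in known_keys:
            record[key] = value.strip()
        # Any other key is dropped, so producers can add new keys
        # without consumers having to be updated in step.
    return record


sample = "Node=EXAMPL\nColour=blue\nLocation=rack 3 | row=B\n"
print(parse_key_values(sample))
```

Because unknown keys are dropped rather than rejected, an older consumer keeps working when a producer starts emitting new keys.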

12 years, 10 months

CSV Proposal

by wonko＠4amlunch.net

On Sun, May 19, 2013 at 05:02:18PM +0100, Robert Jarratt wrote:
> Well, if the separator is | then it isn't a CSV file :-)

While I do understand the origins of the term CSV, the C really does stand for character. Tabs are used quite commonly as well as commas. Any character is valid, so long as it's accepted by all consumers of said file. The C standing for comma is antiquated and outdated and should be changed. In my not so humble opinion, anyway. :)

> I have no issue with this except that allowing the columns to appear in any order, while nice and flexible, makes it harder to write the software. It does not seem worth the effort to have that flexibility given the low enthusiasm for writing this software, so a simplification would be to fix the column order.

It really isn't that hard to write. I've implemented such a thing in many languages (including very stupid ones) and it is *always* worth the effort.

> Actually I am thinking that I wouldn't mind writing the program myself, but don't wait for me, I have a host of other things I want to do as well.

I'm always open to writing anything. :)

-brian

12 years, 10 months

CSV Proposal

by robert.jarratt＠ntlworld.com

-----Original Message-----
From: owner-hecnet at Update.UU.SE [mailto:owner-hecnet at Update.UU.SE] On Behalf Of Brian Hechinger
Sent: 19 May 2013 16:03
To: HECnet Mailing List
Subject: [HECnet] CSV Proposal

> CSV as a data format is actually quite powerful, flexible and simple. In order for it to work, however, we need to agree on some ground rules.
>
> The ground rules are simply this:
>
> 1) The first line must *always* be the header. This is not optional.
> 2) The field separator must be chosen.
> 3) The names of the columns must be decided on.
> 4) The column names should be case insensitive.
> 5) The columns need to be defined as required or optional.
> 6) This all needs to be documented somewhere.
>
> The reason for the header being required is you can do column name to column position mapping. This makes the CSV column layout not fixed. The columns can be in whichever order each person finds appropriate.
>
> #2 is pretty much already chosen as | seems to be the most common one (one of my favorites as it rarely appears in text).
>
> There are some already commonly used column names at this point, so we should make a list of what those are and which we are going to use.
>
> I think the rest are self explanatory. :)
>
> Let's discuss!!
>
> -brian

Well, if the separator is | then it isn't a CSV file :-)

I have no issue with this except that allowing the columns to appear in any order, while nice and flexible, makes it harder to write the software. It does not seem worth the effort to have that flexibility given the low enthusiasm for writing this software, so a simplification would be to fix the column order.

Actually I am thinking that I wouldn't mind writing the program myself, but don't wait for me, I have a host of other things I want to do as well.

Regards

Rob

12 years, 10 months

CSV Proposal

by wonko＠4amlunch.net

CSV as a data format is actually quite powerful, flexible and simple. In order for it to work, however, we need to agree on some ground rules.

The ground rules are simply this:

1) The first line must *always* be the header. This is not optional.
2) The field separator must be chosen.
3) The names of the columns must be decided on.
4) The column names should be case insensitive.
5) The columns need to be defined as required or optional.
6) This all needs to be documented somewhere.

The reason for the header being required is you can do column name to column position mapping. This makes the CSV column layout not fixed. The columns can be in whichever order each person finds appropriate.

#2 is pretty much already chosen as | seems to be the most common one (one of my favorites as it rarely appears in text).

There are some already commonly used column names at this point, so we should make a list of what those are and which we are going to use.

I think the rest are self explanatory. :)

Let's discuss!!

-brian
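[Editor's sketch] The ground rules in the proposal above can be illustrated with a short Python sketch. It is not code from the thread, and the column names (`node`, `owner`, `location`) and their required/optional split are hypothetical, since the list had not yet agreed on any. It shows a mandatory header line, `|` as the separator, case-insensitive column names, and name-to-position mapping so that columns may appear in any order.

```python
import csv
import io

# Hypothetical column sets; the thread has not agreed on real names yet.
REQUIRED = {"node", "owner"}
OPTIONAL = {"location"}


def parse_records(text, delimiter="|"):
    """Parse delimiter-separated text whose first line is the header.

    Returns a list of dicts keyed by lower-cased column name.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter,
                        quoting=csv.QUOTE_NONE)
    try:
        # Rule 1 and rule 4: the header is mandatory and case insensitive.
        header = [name.strip().lower() for name in next(reader)]
    except StopIteration:
        raise ValueError("missing header line")
    missing = REQUIRED - set(header)
    if missing:
        raise ValueError("missing required columns: "
                         + ", ".join(sorted(missing)))
    known = REQUIRED | OPTIONAL
    records = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Map each value to its column name by position, so the column
        # order in the file does not matter; unknown columns are dropped.
        records.append({name: value.strip()
                        for name, value in zip(header, row)
                        if name in known})
    return records


# The same placeholder record with two different column orders.
sample_a = "Node|Owner|Location\nEXAMPL|Someone|Somewhere\n"
sample_b = "LOCATION|NODE|OWNER\nSomewhere|EXAMPL|Someone\n"
print(parse_records(sample_a) == parse_records(sample_b))
```

The position lookup is built once from the header, which is the whole cost of leaving the column order free.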

12 years, 10 months

Another sillyness. More information in the nodename database on MIM.

by wonko＠4amlunch.net

On Sun, May 19, 2013 at 02:31:25PM +0200, Johnny Billquist wrote:
> In a way it might be important to remember that my nodename database extensions right now is just a fun project on my behalf without much defined needs. Just like most things people do on HECnet it is totally voluntary, and can be considered a pet project without any real demand.

You just described everything we're doing here. :)

> I have not really seen any need for this stuff, so it's just for my enjoyment. I personally found the INFO.TXT files unsatisfying, so I'm doing something else. If it is meaningful or useful is not my primary concern. It satisfies me. If it later turns out to be useful, that is really cool. The same thing can be said of HECnet in general.

I repeat my earlier statement here. :)

> All that said, yes, I could definitely consider a more strict format of how something like location is defined. However, you will never see me use XML if I can ever avoid it. And it is a really bad fit with older computer systems, as it quickly becomes so big. The same can be said of JSON. If we really want to get fine grained information in fields, I would do that natively in Datatrieve. Why on earth would I use a silly serializable external representation when I already have the data in a database?

I am also against XML and JSON, but for mainly the same reason. CSV can and will do everything we need here so why overcomplicate it?

We aren't questioning your need for the database or why we want to save things externally. This is more a means to an end. How do we (meaning all of us whose names aren't Johnny Billquist) update said database? Giving us database access is one way, but I'm sure not everyone wants to bother with that. The INFO.TXT file is a great way for those people to get data into the database. The more information is in the database, the more useful the database will end up being.

> (And while JSON might be easier for a human to read, it's about as unfriendly as XML when it comes to actually manipulating by hand. You need tools, and those tools are also just out of the question on something like a PDP-11. And I will not write my own.)

JSON is easier to read when you use it for such things as config files. When you start using it to store rows of data it can very quickly become less readable, at least when compared to CSV.

> Right. And my main concern is actually that people do not have enough interest to make this work. So it is either having a project by a single individual, or else have something that never takes off.

There are at a minimum two people who are interested to make this work. I'm one and you are the other. We've proven over the years that we can be a force to be reckoned with if we so choose. :)

I think also it's not about it "taking off". It's something we put out there that people use or don't use. We can't (and don't want to) force anyone into using these tools. We just want them to be available if they do.

> But that is just what I think.

You think too much. :)

> So at this point I do not believe in a distributed effort of some sort. I also do not believe in doing things in something ugly like XML or JSON. I have a database. I can change the schema of that, if needed. That is the easy part. The hard part is having data *in* the database that is up to date, and in a conformant shape. Neither of those problems are addressed by having a schema as such, nor by using some other representation.

I'll repeat here what I said before. The proposal is to simply have a manner of getting data into the database. A properly defined schema for some file that can be parsed is certainly a good idea in my opinion.

I will start a new thread for the CSV proposal as this one is getting a tad off track. :)

-brian

12 years, 10 months

Another sillyness. More information in the nodename database on MIM.

by wonko＠4amlunch.net

On Sun, May 19, 2013 at 05:30:01AM +0000, Mark Wickens wrote:
> On 18/05/2013 16:49, Dave McGuire wrote:
>> I've done a lot of database work, going back to Ingres and QUEL. I'm one of those weirdos who actually enjoys databases...I think most people find database work to be dry and boring, but I find it fascinating and stimulating. I've seen DTR applications used in production but have never had any exposure at all to the software...having something to actually *do* with it, like this node database for HECnet, is great stuff, and a great way to learn.
>>
>> -Dave
>
> Dave, I avoided databases for a long time, but I kind of enjoy them too, although it often leaves me feeling a little dirty ;)

Databases are where you store data. I think that's pretty obvious. :) They can actually be quite fun. I've written a resume generating program entirely in PL/pgSQL. It takes data out of the database (including templates) and spits out tex. Now *that* was dirty. :)

-brian

12 years, 10 months

Another sillyness. More information in the nodename database on MIM.

by bqt＠softjar.se

On 2013-05-19 07:20, Mark Wickens wrote:
> On 19/05/2013 01:25, Johnny Billquist wrote:
>> After some more thinking on my part, I think I'm going to go slow on this. Yes, having me manage this does not scale. On the other hand, I'm not convinced it needs scaling. If updates really becomes an issue, or if people really starts asking for, and using data from this database, I look at it more as a curiosity and extension of just the basic need I had for a database for node names and node numbers, with a person attached. But I'm very interested in having more of a discussion about what could possibly be done. And I also take donations in form of code... ;-)
>
> If the info is to be useful, other than in a TYPE INFO.TXT capacity then it needs to be in a machine readable format with a well defined schema. Two obvious choices are XML or JSON. Whilst I work more with XML in a day-to-day basis JSON is definitely more 'human-friendly'.

In a way it might be important to remember that my nodename database extensions right now is just a fun project on my behalf without much defined needs. Just like most things people do on HECnet it is totally voluntary, and can be considered a pet project without any real demand.

I have not really seen any need for this stuff, so it's just for my enjoyment. I personally found the INFO.TXT files unsatisfying, so I'm doing something else. If it is meaningful or useful is not my primary concern. It satisfies me. If it later turns out to be useful, that is really cool. The same thing can be said of HECnet in general.

All that said, yes, I could definitely consider a more strict format of how something like location is defined. However, you will never see me use XML if I can ever avoid it. And it is a really bad fit with older computer systems, as it quickly becomes so big. The same can be said of JSON. If we really want to get fine grained information in fields, I would do that natively in Datatrieve. Why on earth would I use a silly serializable external representation when I already have the data in a database?

(And while JSON might be easier for a human to read, it's about as unfriendly as XML when it comes to actually manipulating by hand. You need tools, and those tools are also just out of the question on something like a PDP-11. And I will not write my own.)

> I think we've got to this point in this discussion quite a few times - what is the point of taking the effort to get hecnet machine owners to provide this information? From a personal point of view I think a google map with an indication of the links between areas would be great, but there is not really any point in doing this unless we have a good initial uptake and then people keep their info up-to-date. It would be good promotional material, if that's deemed of use.

Right. And my main concern is actually that people do not have enough interest to make this work. So it is either having a project by a single individual, or else have something that never takes off.

But that is just what I think.

> I'm happy to define an initial schema based in the existing info file data (looking at how the data currently available could be shoehorned into a tighter schema definition) and we could go from there.

Feel free. Don't expect me to use it. :-)

> However, I'm suspecting that there's probably half a dozen of us that would get on board with this - how many of the area operators are active on the mailing list?

Right. The people actively participating in this discussion is less than the number of "responsible" persons.

So at this point I do not believe in a distributed effort of some sort. I also do not believe in doing things in something ugly like XML or JSON. I have a database. I can change the schema of that, if needed. That is the easy part. The hard part is having data *in* the database that is up to date, and in a conformant shape. Neither of those problems are addressed by having a schema as such, nor by using some other representation.

Sorry. I got a bit carried away after I came as far as XML. :-)

Your suggestions and offers to work are most appreciated, Mark.

Johnny

--
Johnny Billquist                  || "I'm on a bus
                                  ||  on a psychedelic trip
email: bqt at softjar.se          ||  Reading murder books
pdp is alive!                     ||  tryin' to stay hip" - B. Idol

12 years, 10 months
