TCP/IP doesn't care about the source MAC address. And the destination MAC
address is just used to make the packet end up at the right destination,
without any other meaning to it. The mapping between destination IP address
and MAC address is resolved by ARP, and several different IP addresses can
use the same MAC address. Exactly what that MAC address looks like isn't an
issue either. So, for TCP/IP, everything will work fine if you just use the
actual MAC address the card has, even though you might be sending traffic
both from the host, and from a simulator within the host. Reception is a bit
worse, since you need to figure out if the received packet should go to the
host or the simulator.
DECnet works in a very different way, where each host *must* have a unique
MAC address, and where the recipient really checks that the source MAC
address is consistent with what was expected, or else the packet is dropped
(yes, I'm writing and meaning source address here). And sending data is done
to a specific MAC address without ARP. Instead, the MAC address is calculated
from the DECnet address, and is thus "known" already, for all hosts on the
network.
Believe me. You are not really seeing or understanding the problem yet, with
the tests you are doing.
Parts are things you need to read up on, on how DECnet works. And parts
you'll only find out the hard way, as I did. :-)
Johnny
--
Johnny Billquist || "I'm on a bus
|| on a psychedelic trip
email: bqt at softjar.se || Reading murder books
pdp is alive! || tryin' to stay hip" - B. Idol
I see... yeah I have a VMS image, but really no idea how to set up its
networking... I 'used' VMS as a user ages ago, although it was mostly
using EDT and fighting the gold key thing on terminals....
Anyways, are there some online docs on setting up VMS's networking? I'd
love to be able to dig deeper into this....
Jason Stevens wrote:
Well, normally libpcap is used to receive packets, so it doesn't necessarily
help with sending them.
Second, sending packets is one thing, sending packets with a "fake" source
MAC address is yet another thing (as a short note, the DEUNA and DELUA
ethernet controllers for the Unibus can never do this. They set the source
MAC address from the controller, no matter what you place in the packet you
want to send).
But please report if you have success with this.
This is what I'm using right now to have two versions of SIMH send
pings to each other... I added the TCP/IP stuff, and removed all of
the libpcap parts of the code.. I also noticed it uses blocking
sockets so it'll get 'stuck' from time to time reading stuff. I
started work on porting it to Windows but I've been away on vacation
so it didn't get any work done...
"ping" sort of implies that you are talking TCP/IP. If so, the test is meaningless.
This is what I'm using right now to have two versions of SIMH send
pings to each other... I added the TCP/IP stuff, and removed all of
the libpcap parts of the code.. I also noticed it uses blocking
sockets so it'll get 'stuck' from time to time reading stuff. I
started work on porting it to Windows but I've been away on vacation
so it didn't get any work done...
Hopefully gmail doesn't screw the formatting up too badly.
---8<---8<---8<---8<---8<---8<---8<---8<---8<---8<
/* A simple DECnet bridge program
* (c) 2003, 2005 by Johnny Billquist
* Version 2.1 Fixed code for OpenBSD and FreeBSD as well.
* Version 2.0 (I had to start using a version number sometime, and
* since I don't have any clue to the history of my
* development here, I just picked 2.0 because I liked
* it.)
* Some more text will come here later. */
#define DEBUG 0
#define MAX_HOST 16
#define CONF_FILE "bridge.conf"
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <sys/time.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <netdb.h>
#include <signal.h>
#include <string.h>
#include <strings.h> /* bzero() */
#include <arpa/inet.h> /* inet_aton(), inet_ntoa(), htons() */
/* Throttling control:
* THROTTLETIME - (mS)
* If packets come closer in time than this, they are
* a base for perhaps considering throttling.
* THROTTLEPKT - (#)
* The number of packets in sequence that fulfill
* THROTTLETIME that means throttling will kick in.
* THROTTLEDELAY - (uS)
* The delay to insert when throttling is active.
*
* Passive connection control:
* PASSIVE_TMO - (mS)
* If nothing has been received from a passive node
* in this time, sending to it will stop.
*/
#define THROTTLETIME 5
#define THROTTLEPKT 4
#define THROTTLEDELAY 10
#define PASSIVE_TMO 180000L
#define THROTTLEMASK ((1 << THROTTLEPKT) - 1)
#define ETHERTYPE_DECnet 0x6003
#define ETHERTYPE_LAT 0x6004
#define ETHERTYPE_MOPDL 0x6001
#define ETHERTYPE_MOPRC 0x6002
#define ETHERTYPE_IP 0x0800
#define ETHERTYPE_ARP 0x0806
#define ETHERTYPE_REVARP 0x8035
#define MAX(a,b) ((a)>(b)?(a):(b))
/* The structures and other global data we keep info in.
It would perhaps be nice if we could reload this, and
in case of failure keep the old stuff, but for now we
don't care that much... */
/* The data structures we have are the port, which describe
a source/destination for data. It also holds info about which
kind of traffic should be forwarded to this site. It is also
used to filter incoming packets. If we don't send something, we
don't want it from that side either.
We have the host table, which is a hash table for all known
destinations, so that we can optimize the traffic a bit.
When data arrives, we filter, process and send it out again.
*/
typedef enum {DECnet, LAT, IP} pkttyp;
#define MAXTYP 3
struct BRIDGE {
char name[40];
char host[80];
in_addr_t addr;
unsigned short port; /* Network byte order; unsigned so it compares equal to sin_port. */
int passive;
int fd;
int types[MAXTYP];
char last[8][14];
int lastptr;
int rcount;
int tcount;
int xcount;
struct timeval lasttime;
int throttle;
int throttlecount;
struct timeval lastrcv;
};
struct DATA {
int source;
pkttyp type;
int len;
const char *data;
};
struct HOST {
struct HOST *next;
unsigned char mac[6];
int bridge;
};
#define HOST_HASH 65536
struct HOST *hosts[HOST_HASH];
struct BRIDGE bridge[MAX_HOST];
int bcnt = 0;
int sd;
/* Here come the code... */
int lookup(struct sockaddr_in *sa, char *data)
{
int i;
for (i=0; i<bcnt; i++) {
if ((bridge[i].addr == sa->sin_addr.s_addr) &&
(bridge[i].port == sa->sin_port))
return i;
}
return -1;
}
int lookup_bridge(char *newbridge)
{
int i;
int l = strlen(newbridge);
#if DEBUG
printf("Trying to match %s\n", newbridge);
#endif
for (i=0; i<bcnt; i++) {
#if DEBUG
printf("Matching against: %s\n", bridge[i].name);
#endif
if ((strcmp(newbridge,bridge[i].name) == 0) &&
(l == strlen(bridge[i].name))) {
#if DEBUG
printf("Found match: %s == %s\n", newbridge, bridge[i].name);
#endif
return i;
}
}
#if DEBUG
printf("No match found\n");
#endif
return -1;
}
void add_bridge(char *name, char *dst)
{
struct hostent *he;
char rhost[40];
int port;
int i,found=0;
in_addr_t addr;
char *p;
int passive = 0;
if (bcnt < MAX_HOST) {
bzero(&bridge[bcnt],sizeof(struct BRIDGE));
if (*name == '~') {
passive = 1;
name++;
}
strcpy(bridge[bcnt].name,name);
p = strchr(dst,':');
if (p == NULL) { /* Assume local descriptor */
found = -1;
} else {
*p = ' ';
sscanf(dst,"%s %d", rhost, &port);
if ((he = gethostbyname(rhost)) != NULL) {
addr = *(in_addr_t *)he->h_addr;
found = -1;
} else {
found = inet_aton(rhost,(struct in_addr *)&addr);
}
if (found) {
strcpy(bridge[bcnt].host,rhost);
bridge[bcnt].addr = addr;
bridge[bcnt].port = htons(port);
bridge[bcnt].fd = sd;
}
}
if (found) {
for (i=0; i<MAXTYP; i++) bridge[bcnt].types[i] = 0;
bridge[bcnt].rcount = 0;
bridge[bcnt].tcount = 0;
bridge[bcnt].passive = passive;
bcnt++;
#if DEBUG
printf("Adding router ''%s''. %08x:%d\n", name, addr, port);
#endif
}
} else {
printf("Warning. Bridge table full. Not adding %s (%s)\n", name, dst);
}
}
int add_service(char *newbridge, pkttyp type, char *name)
{
int i;
#if DEBUG
printf("Adding %s bridge %s.\n", name, newbridge);
#endif
if ((i = lookup_bridge(newbridge)) >= 0) {
if (bridge[i].types[type]++ > 0) {
printf("%s bridge %s added multiple times.\n", name, newbridge);
}
return 1;
}
return 0;
}
void read_conf()
{
FILE *f;
int mode = 0;
int area,node;
int line;
char buf[80];
char buf1[40],buf2[40];
int i;
if ((f = fopen(CONF_FILE,"r")) == NULL) {
perror("opening bridge.conf");
exit(1);
}
for (i=0; i<bcnt; i++) {
if (bridge[i].fd != sd) close(bridge[i].fd);
}
bcnt = 0;
for (i=0; i<HOST_HASH; i++) {
struct HOST *h, *n;
h = hosts[i];
hosts[i] = NULL;
while(h) {
n = h->next;
free(h);
h = n;
}
}
line = 0;
while (fgets(buf,sizeof(buf),f) != NULL) {
buf[strcspn(buf,"\r\n")] = 0; /* Strip the line ending, if any. */
line++;
if((strlen(buf) > 2) && (buf[0] != '!')) {
if(buf[0]=='[') {
mode = -1;
if(strcmp(buf,"[bridge]") == 0) mode = 0;
if(strcmp(buf,"[decnet]") == 0) mode = 1;
if(strcmp(buf,"[lat]") == 0) mode = 2;
if(sscanf(buf,"[source %d.%d]", &area, &node) == 2) mode = 3;
if(strcmp(buf,"[relay]") == 0) mode = 4;
if(strcmp(buf,"[ip]") == 0) mode = 5;
if(mode < 0) {
printf("Bad configuration at line %d\n%s\n", line,buf);
exit(1);
}
} else {
switch (mode) {
case 0:
if (sscanf(buf, "%s %s", buf1, buf2) == 2) {
add_bridge(buf1,buf2);
} else {
printf("Bad bridge at line %d\n%s\n", line, buf);
exit(1);
}
break;
case 1:
if (!add_service(buf,DECnet,"DECnet"))
printf("%d: DECnet bridge %s doesn't exist.\n", line, buf);
break;
case 2:
if (!add_service(buf,LAT,"LAT"))
printf("%d: LAT bridge %s doesn't exist.\n", line, buf);
break;
case 3:
break;
case 4:
break;
case 5:
if (!add_service(buf,IP,"IP"))
printf("%d: IP bridge %s doesn't exist.\n", line, buf);
break;
default:
printf("weird state at line %d\n",line);
exit(1);
}
}
}
}
fclose(f);
}
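int is_ethertype(struct DATA *d, int type)
{
/* Compare as unsigned bytes: plain char may be signed, in which case
types with the high bit set (such as 0x8035) would never match. */
return (((unsigned char)d->data[13] == (type & 0xff)) &&
((unsigned char)d->data[12] == (type >> 8)));
}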
int is_decnet(struct DATA *data)
{
return is_ethertype(data, ETHERTYPE_DECnet);
}
int is_lat(struct DATA *data)
{
return (is_ethertype(data, ETHERTYPE_LAT) ||
is_ethertype(data, ETHERTYPE_MOPDL) ||
is_ethertype(data, ETHERTYPE_MOPRC));
}
int is_ip(struct DATA *data)
{
return (is_ethertype(data, ETHERTYPE_IP) ||
is_ethertype(data, ETHERTYPE_ARP) ||
is_ethertype(data, ETHERTYPE_REVARP));
}
unsigned long timedelta(struct timeval old)
{
struct timeval now;
unsigned long delta;
gettimeofday(&now, NULL);
delta = now.tv_sec - old.tv_sec;
delta *= 1000;
delta += ((now.tv_usec - old.tv_usec) / 1000);
return delta;
}
void throttle(int index)
{
long delta;
delta = timedelta(bridge[index].lasttime);
bridge[index].throttle <<= 1;
bridge[index].throttle += (delta < THROTTLETIME ? 1 : 0);
if ((bridge[index].throttle & THROTTLEMASK) == THROTTLEMASK) {
bridge[index].throttlecount++;
usleep(THROTTLEDELAY);
}
gettimeofday(&bridge[index].lasttime,NULL);
}
int active(int index)
{
if (bridge[index].passive == 0) return 1;
if (timedelta(bridge[index].lastrcv) < PASSIVE_TMO) return 1;
return 0;
}
/* do the actual sending */
void send_packet(int index, struct DATA *d)
{
struct sockaddr_in sa;
if (index == d->source) return; /* Avoid loopback of data. */
if (bridge[index].types[d->type] == 0) return; /* Avoid sending unwanted frames */
if (active(index)) {
bridge[index].tcount++;
throttle(index);
if (bridge[index].addr == 0) {
write(bridge[index].fd,d->data,d->len); /* Local network. */
} else {
sa.sin_family = AF_INET; /* Remote network. */
sa.sin_port = bridge[index].port;
sa.sin_addr.s_addr = bridge[index].addr;
if (sendto(bridge[index].fd,d->data,d->len,0,(struct sockaddr
*)&sa,sizeof(sa)) == -1)
perror("sendto");
}
bridge[index].lastptr = (bridge[index].lastptr+1) & 7;
memcpy(bridge[index].last[bridge[index].lastptr],d->data,14);
}
}
void register_source(struct DATA *d)
{
unsigned short hash;
struct HOST *h;
hash = *(unsigned short *)(d->data+10);
h = hosts[hash];
while (h) {
if (memcmp(h->mac, d->data+6, 6) == 0) {
h->bridge = d->source;
#if DEBUG
printf("Setting existing hash to bridge %d\n", h->bridge);
#endif
return;
}
h = h->next;
}
h = malloc(sizeof(struct HOST));
if (h == NULL) return; /* Out of memory: skip learning this source. */
h->next = hosts[hash];
memcpy(h->mac,d->data+6,6);
h->bridge = d->source;
#if DEBUG
printf("Adding new hash entry. Port is %d\n", h->bridge);
#endif
hosts[hash] = h;
}
int locate_dest(struct DATA *d)
{
unsigned short hash;
struct HOST *h;
if (d->data[0] & 1) return -1; /* Ethernet multicast */
hash = *(unsigned short *)(d->data+4);
h = hosts[hash];
while (h) {
if (memcmp(h->mac, d->data, 6) == 0) return h->bridge;
h = h->next;
}
return -1;
}
void process_packet(struct DATA *d)
{
int dst;
int i;
bridge[d->source].rcount++;
gettimeofday(&bridge[d->source].lastrcv, NULL);
for (i=0; i<8; i++) {
if (memcmp(bridge[d->source].last[i],d->data,14) == 0) {
return;
}
}
if (is_decnet(d)) d->type = DECnet;
if (is_lat(d)) d->type = LAT;
if (is_ip(d)) d->type = IP;
if (d->type == -1) return; /* Unknown ethertype: check before indexing types[]. */
if (bridge[d->source].types[d->type] == 0) return;
bridge[d->source].xcount++;
register_source(d);
dst = locate_dest(d);
if (dst == -1) {
int i;
for (i=0; i<bcnt; i++) send_packet(i, d);
} else {
send_packet(dst, d);
}
}
void dump_data()
{
int i;
printf("Host table:\n");
for (i=0; i<bcnt; i++)
printf("%d: %s %s:%d (Rx: %d Tx: %d (Drop rx: %d)) Active: %d Throttle: %d(%03o)\n",
i,
bridge[i].name,
inet_ntoa(*(struct in_addr *)&bridge[i].addr),
ntohs(bridge[i].port),
bridge[i].rcount,
bridge[i].tcount,
bridge[i].rcount - bridge[i].xcount,
active(i),
bridge[i].throttlecount,
bridge[i].throttle & 255);
printf("Hash of known destinations:\n");
for (i=0; i<HOST_HASH; i++) {
struct HOST *h;
h=hosts[i];
while (h) {
printf("%02x%02x%02x%02x%02x%02x -> %d",
(unsigned char)h->mac[0],
(unsigned char)h->mac[1],
(unsigned char)h->mac[2],
(unsigned char)h->mac[3],
(unsigned char)h->mac[4],
(unsigned char)h->mac[5],
h->bridge);
if ((unsigned char)h->mac[0] == 0xaa &&
(unsigned char)h->mac[1] == 0x00 &&
(unsigned char)h->mac[2] == 0x04 &&
(unsigned char)h->mac[3] == 0x00) {
printf(" (%d.%d)", h->mac[5] >> 2, ((h->mac[5] & 3) << 8) + h->mac[4]);
}
printf("\n");
h = h->next;
}
}
}
int main(int argc, char **argv)
{
struct sockaddr_in sa,rsa;
int len,i,hsock;
fd_set fds;
socklen_t ilen;
int port;
struct DATA d;
char buf[8192];
signal(SIGHUP, read_conf);
signal(SIGUSR1, dump_data);
if (argc != 2) {
printf("usage: %s <port>\n", argv[0]);
exit(1);
}
if ((sd = socket(PF_INET, SOCK_DGRAM, 0)) == -1) {
perror("socket");
exit(1);
}
memset(&sa, 0, sizeof(sa));
sa.sin_family = AF_INET;
sa.sin_port = htons(atoi(argv[1]));
sa.sin_addr.s_addr = INADDR_ANY;
if (bind(sd, (struct sockaddr*)&sa, sizeof(sa)) == -1) {
perror("bind");
exit(1);
}
read_conf();
#if DEBUG
dump_data();
#endif
while(1) {
FD_ZERO(&fds);
hsock = 0;
for (i=0; i<bcnt; i++) {
FD_SET(bridge[i].fd, &fds);
if (bridge[i].fd > hsock) hsock = bridge[i].fd;
}
if (select(hsock+1,&fds,NULL,NULL,NULL) == -1) {
if (errno != EINTR) {
perror("select");
exit(1);
}
continue; /* Interrupted by a signal: fds is undefined, so rebuild it. */
}
for (i=0; i<bcnt; i++) {
if (FD_ISSET(bridge[i].fd,&fds)) {
d.source = i;
d.type = -1;
if (bridge[i].addr == 0) {
d.data = NULL;
d.len = 0; /* Was h.caplen when libpcap supplied the packet. */
if (d.data) {
process_packet(&d);
}
} else {
ilen = sizeof(rsa);
if ((d.len = recvfrom(bridge[i].fd, buf, 1518, 0,
(struct sockaddr *)&rsa, &ilen)) > 0) {
if ((d.source = lookup(&rsa, buf)) >= 0) {
d.data = buf;
process_packet(&d);
}
}
}
FD_CLR(bridge[i].fd, &fds);
}
}
}
}
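For reference, a bridge.conf that the read_conf() routine above would accept could look roughly like the following. This is only a sketch derived from reading the parser: the bridge names, host names and port numbers are made up. Lines starting with '!' are comments, lines of two characters or less are skipped, a leading '~' marks a passive bridge, and an entry in [bridge] without a colon is treated as a local descriptor (which does nothing useful in this libpcap-free version).

```
! Hypothetical example configuration
[bridge]
remote1 example.org:4711
~roaming 192.0.2.10:4711

[decnet]
remote1
roaming

[lat]
remote1
```

Note that the '~' is stripped when the name is stored, so the service sections refer to the passive bridge as plain "roaming". The [source a.n] and [relay] sections are recognized by the parser, but their contents are ignored.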
Paul Koning wrote:
"Jason" == Jason Stevens <neozeed at gmail.com> writes:
Jason> Ah I had thought fragmentation was a 'feature' of TCP not
Jason> UDP.. well then that would take care of it then.
There are two things that serve similar purposes and have confusingly
similar names.
"Segmentation" is what TCP does to break user data into convenient
size chunks. That chunk size is MSS (Max Segment Size), it's
negotiated at TCP connection setup.
"Fragmentation" is what IPv4 does to break datagrams into chunks that
fit on the wire (that size is MTU, Max Transmission Unit).
Also note that MTU can differ from hop to hop over the internet. So a packet that goes over the first hop as one piece might need to be fragmented to get over the next hop.
Segmentation is more efficient than fragmentation, so with TCP the
normal approach is to pick MSS such that the resulting packets end up
no larger than MTU. In the case of UDP, the packet size is set by the
application (because each application send results in exactly one UDP
packet), so fragmentation will occur if the application sends stuff
bigger than MTU - header size.
No. Segmentation isn't more efficient, as such. However, it behaves much better in the face of lost packets.
Johnny
Paul Koning wrote:
"Jason" == Jason Stevens <neozeed at gmail.com> writes:
>> As for fragmentation... Now I assume you are talking about the
>> encapsulation of ethernet packets in UDP packets. Those will be a
>> bit larger still, and will almost certainly be fragmented when
>> sent over the internet, yes. I don't see a problem with that. Do
>> you?
>>
Jason> Well if you were trying to send the whole 1500 bytes of data +
Jason> headers in the UDP packet won't it cut stuff off?
No, not unless it's either IPv6, or you set the "don't fragment" flag
in the IP header. IPv4 will fragment oversized packets no matter
what's inside, and indeed this is the only way for random size
UDPgrams to get where they are going. It should work fine. Note that
fragmentation is often not all that efficient. Compared with the
performance of old 10Mb/s DEC Ethernet gear, that's unlikely to be an
issue.
Actually, since there isn't a good way of finding out how large a packet you can send, fragmentation is almost necessary. It can occur anywhere along the way from source to destination, not only at the start or end.
And fragmentation is actually the most efficient way of getting data across. The problem is if one fragment is lost, which will cause retransmission of a lot of fragments. That's the reason TCP tries to avoid it.
But with large IP packets, you get a lower protocol overhead compared to actual data transferred.
Johnny
Jason Stevens wrote:
On Thu, Feb 26, 2009 at 8:19 AM, Johnny Billquist <bqt at softjar.se> wrote:
Jason Stevens wrote:
Right now testing, although I'm making a 'stub' version to run under
Windows so I can set up a bridge on a Windows hosting box I have...
Hmm. I suspect that might turn out to be very hard, if not impossible...
One of the serious problems is that you need to be able to send out packets
with faked source MAC addresses. And all tools I've ever worked with in
Windows which need to use some other MAC address have been tricky to say the
least, not to mention that you need to reconfigure, and for some, also
reboot the machine to use another MAC address. Transparent ethernet access
from Windows seems to be difficult.
There is libpcap for windows, you can do it... although I was going to
remove the physical access, and just have it forward the packets... A
bridge in the sky as it were.
Well, normally libpcap is used to receive packets, so it doesn't necessarily help with sending them.
Second, sending packets is one thing, sending packets with a "fake" source MAC address is yet another thing (as a short note, the DEUNA and DELUA ethernet controllers for the Unibus can never do this. They set the source MAC address from the controller, no matter what you place in the packet you want to send).
But please report if you have success with this.
I noticed from the source there is no way to handle packets of 1500
bytes, as it would invariably require fragmentation... is
LAT/MOP/DECnet all smaller than 1500 bytes?
I'm not sure what you mean. The code bridges Ethernet. Ethernet packets are
(normally) a maximum of 1518 bytes: 1500 bytes of data, 6 bytes each for the
source and destination MAC, two bytes with protocol number, and four bytes
of CRC. Is there somewhere where you have identified that the bridge program
doesn't handle 1500 bytes of data?
I'm not really interested in jumbo ethernet frames. No DEC equipment
relevant here supports it anyway, so there is no need.
As for fragmentation... Now I assume you are talking about the encapsulation
of ethernet packets in UDP packets. Those will be a bit larger still, and
will almost certainly be fragmented when sent over the internet, yes. I
don't see a problem with that. Do you?
Well if you were trying to send the whole 1500 bytes of data + headers
in the UDP packet won't it cut stuff off? I know TCP will do the auto
fragment & reassemble, so you won't notice it... I haven't stressed
the thing out yet, just some simple ICMP stuff at the moment.. I
should add, I've added code to send through ARP/RARP & IP since I don't
have any idea on configuring networking on VMS..
No. Two different things.
TCP intentionally sends short enough packets that they don't get fragmented. TCP then keeps track of everything to actually recreate a stream of bytes at the other end. IP will fragment packets if needed, no matter if you use UDP, TCP or any other protocol that is transported on top of IP.
That is, unless the datagram has the DF flag set (don't fragment), or the IP implementation hasn't implemented fragmentation. They should, but there is always the odd implementation that might not do it.
Anyways I was thinking of adding compression support (zlib style)... I
should test it some more as I figure if the packet compresses somewhat
it'll be under the 1500 snaplen for ethernet....
Feel free. I'm not sure there is a point of doing so, but anyway.
If you want to avoid fragmentation of packets, you need to find out what the
minimum MTU is along the whole path between two nodes. And your packet then
needs to fit both the data to carry as well as the UDP and IP header within
that packet size. The only size you can count on without knowing the path is
576 bytes, which is what every IPv4 host is required to accept (and even that
is not a strict guarantee against fragmentation along the way). For UDP that
means a size of the data part of just 548 bytes (576 minus 20 bytes of IP
header and 8 bytes of UDP header).
And that means you need to do fragmentation yourself, since you'll never be
able to squeeze 1500 byte packets into 548 bytes with any kind of certainty.
And if you start doing the fragmentation yourself, you will in fact add to
the overhead, since then you'll create a separate packet for each fragment,
all with their IP *and* UDP headers. And you'll need to do fragment
reassembly, and what will you do if some fragment is missing? Do
retransmissions? Now we're getting into a very complex protocol with
acknowledgements and whatever. And don't forget the cpu overhead of doing
the compression/decompression of each packet. And if you start to have
fragments, acknowledgements and so on, you'll have to add information to
each packet to handle all this information, which means extra overhead there
as well. In the end, you should not be surprised if your packets end up
being larger than the current ones.
Yeah that's what I was trying to avoid.... I mean it's not that hard
to sequence things, and have it note fragment numbers, and of course
being aware of the sequence and reassembly.... But then at that point,
it'd be easier to just use TCP.. As for the compression I was thinking
more so of loading a HECnet enabled version of SIMH on an iPhone... I
know not the most 'useful' but I think it'd be cool to have a pocket
VAX on the network...
But compression of the network traffic is still not a gain, even on an iPhone (I think).
There are several reasons I prefer using UDP over TCP for HECnet. Not the least that it's okay to actually drop packets, or get them out of order. Using UDP also adds minimal overhead. And it's naturally packet oriented. If I were to use TCP, I'd have to reassemble the stream into packets. That means I also need to transfer packet size before each packet, and a lot of "boring" extra work.
And in the "normal" case, each packet is still just generating a single,
unfragmented IP packet anyway.
Like I said. Feel free to play with it, and see what you can do. But
personally I don't think it's a good idea. You add complexity, processing
overhead, protocol overhead, and possibly a whole new level of communication
overhead, to probably no gain at any corner, but at a cost of having a
protocol that makes it harder to troubleshoot. :-)
Ok enough rambling for now.
I hope my ramblings don't scare you off, or turn you off from
experimenting. Having fun and experimenting is after all what HECnet is all
about.
Thanks for the input, yeah I need to play more with it, I do like the
relative simplicity of it... If anything now that I think I got SIMH
tied into it ok, I wanted to add Qemu support, and whatever emulators
I can also find to build up some emulated network.. And again tying
back to a stubbed Windows server that I have on the internet. In the
age of multi-GHz CPUs I don't think libz would be that much
overhead..
A lot larger memory footprint, more cpu usage, maybe larger packets. :-)
I still hate watching resources wasted, even if today's machines are fast. :-)
Johnny
On Thu, Feb 26, 2009 at 1:53 PM, Paul Koning <Paul_Koning at dell.com> wrote:
"Jason" == Jason Stevens <neozeed at gmail.com> writes:
>> As for fragmentation... Now I assume you are talking about the
>> encapsulation of ethernet packets in UDP packets. Those will be a
>> bit larger still, and will almost certainly be fragmented when
>> sent over the internet, yes. I don't see a problem with that. Do
>> you?
>>
Jason> Well if you were trying to send the whole 1500 bytes of data +
Jason> headers in the UDP packet won't it cut stuff off?
No, not unless it's either IPv6, or you set the "don't fragment" flag
in the IP header. IPv4 will fragment oversized packets no matter
what's inside, and indeed this is the only way for random size
UDPgrams to get where they are going. It should work fine. Note that
fragmentation is often not all that efficient. Compared with the
performance of old 10Mb/s DEC Ethernet gear, that's unlikely to be an
issue.
paul
Ah I had thought fragmentation was a 'feature' of TCP not UDP.. well
then that would take care of it then.
I've only used decnet in the 'wild' once, but we moved from the VAX to
PC's... but then our VAX was weird as it ran Novell Netware..... I
wish I had saved the tapes, as netware for the vax has to be super
ultra rare...
Yeah. That would have been fun to have kept around.
I recall it being like the netware for OS/2 it's a shame they priced
themselves out of the market...
if anything HECnet strikes me as a neat way to plug in a bunch of
various emulators & stuff together...
Johnny