{"id":475,"date":"2014-04-09T01:57:55","date_gmt":"2014-04-09T06:57:55","guid":{"rendered":"http:\/\/sunapi386.ca\/wordpress\/?p=475"},"modified":"2014-04-09T01:57:55","modified_gmt":"2014-04-09T06:57:55","slug":"really-happens-visit-web-page","status":"publish","type":"post","link":"https:\/\/sunapi386.ca\/wordpress\/really-happens-visit-web-page\/","title":{"rendered":"What Really Happens When We Visit a Web Page"},"content":{"rendered":"<p>What Really Happens When We Visit a Web Page<\/p>\n<p>A journey down the protocol stack, giving a perspective on the many, many protocols that are involved in a simple request: downloading a web page.<\/p>\n<p>Our setting: a student, Bob, connects his laptop to his school\u2019s Ethernet switch and downloads a web page (www.google.com).<\/p>\n<p>Getting Started: DHCP, UDP, IP, and Ethernet<\/p>\n<p>Bob boots up his laptop and then connects it to an Ethernet cable connected to the school\u2019s Ethernet switch, which in turn is connected to the school\u2019s router. The school\u2019s router is connected to an ISP, comcast.net. Comcast.net is providing the DNS service for the school; thus, the DNS server resides in the Comcast network rather than the school network. We\u2019ll assume that the DHCP server is running within the router, as is often the case. When Bob first connects his laptop to the network, he can\u2019t do anything (e.g., visit a web page) without an IP address. Thus, the first network-related action taken by Bob\u2019s laptop is to run the DHCP protocol to obtain an IP address, as well as other information, from the local DHCP server:<\/p>\n<p>1. The operating system on Bob\u2019s laptop creates a DHCP request message and puts this message within a UDP segment with destination port 67 (DHCP server) and source port 68 (DHCP client). 
The UDP segment is then placed within an IP datagram with a broadcast IP destination address (255.255.255.255) and a source IP address of 0.0.0.0, since Bob\u2019s laptop doesn\u2019t yet have an IP address.<\/p>\n<p>2. The IP datagram containing the DHCP request message is then placed within an Ethernet frame. The Ethernet frame has a destination MAC address of FF:FF:FF:FF:FF:FF so that the frame will be broadcast to all devices connected to the switch (hopefully including a DHCP server); the frame\u2019s source MAC address is that of Bob\u2019s laptop, 00:16:D3:23:68:8A.<br \/>\n<!--more--><br \/>\n3. The broadcast Ethernet frame containing the DHCP request is the first frame sent by Bob\u2019s laptop to the Ethernet switch. The switch broadcasts the incoming frame on all outgoing ports, including the port connected to the router.<\/p>\n<p>4. The router receives the broadcast Ethernet frame containing the DHCP request on its interface with MAC address 00:22:6B:45:1F:1B and the IP datagram is extracted from the Ethernet frame. The datagram\u2019s broadcast IP destination address indicates that this IP datagram should be processed by upper layer protocols at this node, so the datagram\u2019s payload (a UDP segment) is thus demultiplexed up to UDP, and the DHCP request message is extracted from the UDP segment. The DHCP server now has the DHCP request message.<\/p>\n<p>5. Let\u2019s suppose that the DHCP server running within the router can allocate IP addresses in the CIDR block 68.85.2.0\/24. In this example, all IP addresses used within the school are thus within Comcast\u2019s address block. Let\u2019s suppose the DHCP server allocates address 68.85.2.101 to Bob\u2019s laptop. The DHCP server creates a DHCP ACK message containing this IP address, as well as the IP address of the DNS server (68.87.71.226), the IP address for the default gateway router (68.85.2.1), and the subnet block (68.85.2.0\/24) (equivalently, the \u201cnetwork mask\u201d). 
The DHCP message is put inside a UDP segment, which is put inside an IP datagram, which is put inside an Ethernet frame. The Ethernet frame has a source MAC address of the router\u2019s interface to the school network (00:22:6B:45:1F:1B) and a destination MAC address of Bob\u2019s laptop (00:16:D3:23:68:8A).<\/p>\n<p>6. The Ethernet frame containing the DHCP ACK is sent (unicast) by the router to the switch. Because the switch is self-learning and previously received an Ethernet frame (containing the DHCP request) from Bob\u2019s laptop, the switch knows to forward a frame addressed to 00:16:D3:23:68:8A only to the output port leading to Bob\u2019s laptop.<\/p>\n<p>7. Bob\u2019s laptop receives the Ethernet frame containing the DHCP ACK, extracts the IP datagram from the Ethernet frame, extracts the UDP segment from the IP datagram, and extracts the DHCP ACK message from the UDP segment. Bob\u2019s DHCP client then records its IP address and the IP address of its DNS server. It also installs the address of the default gateway into its IP forwarding table. Bob\u2019s laptop will send all datagrams with destination address outside of its subnet 68.85.2.0\/24 to the default gateway. At this point, Bob\u2019s laptop has initialized its networking components and is ready to begin processing the Web page fetch.<\/p>\n<p>Still Getting Started: DNS and ARP<\/p>\n<p>When Bob types the URL for www.google.com into his Web browser, he begins the long chain of events that will eventually result in Google\u2019s home page being displayed by his Web browser. Bob\u2019s Web browser begins the process by creating a TCP socket that will be used to send the HTTP request to www.google.com. In order to create the socket, Bob\u2019s laptop will need to know the IP address of www.google.com. Recall that the DNS protocol is used to provide this name-to-IP-address translation service.<\/p>\n<p>8. 
The operating system on Bob\u2019s laptop thus creates a DNS query message, putting the string \u201cwww.google.com\u201d in the question section of the DNS message. This DNS message is then placed within a UDP segment with a destination port of 53 (DNS server). The UDP segment is then placed within an IP datagram with an IP destination address of 68.87.71.226 (the address of the DNS server returned in the DHCP ACK in step 5) and a source IP address of 68.85.2.101.<\/p>\n<p>9. Bob\u2019s laptop then places the datagram containing the DNS query message in an Ethernet frame. This frame will be sent (addressed, at the link layer) to the gateway router in Bob\u2019s school\u2019s network. However, even though Bob\u2019s laptop knows the IP address of the school\u2019s gateway router (68.85.2.1) via the DHCP ACK message in step 5 above, it doesn\u2019t know the gateway router\u2019s MAC address. In order to obtain the MAC address of the gateway router, Bob\u2019s laptop will need to use the ARP protocol.<\/p>\n<p>10. Bob\u2019s laptop creates an ARP query message with a target IP address of 68.85.2.1 (the default gateway), places the ARP message within an Ethernet frame with a broadcast destination address (FF:FF:FF:FF:FF:FF) and sends the Ethernet frame to the switch, which delivers the frame to all connected devices, including the gateway router.<\/p>\n<p>11. The gateway router receives the frame containing the ARP query message on the interface to the school network, and finds that the target IP address of 68.85.2.1 in the ARP message matches the IP address of its interface. The gateway router thus prepares an ARP reply, indicating that its MAC address of 00:22:6B:45:1F:1B corresponds to IP address 68.85.2.1. It places the ARP reply message in an Ethernet frame, with a destination address of 00:16:D3:23:68:8A (Bob\u2019s laptop) and sends the frame to the switch, which delivers the frame to Bob\u2019s laptop.<\/p>\n<p>12. 
Bob\u2019s laptop receives the frame containing the ARP reply message and extracts the MAC address of the gateway router (00:22:6B:45:1F:1B) from the ARP reply message.<\/p>\n<p>13. Bob\u2019s laptop can now (finally!) address the Ethernet frame containing the DNS query to the gateway router\u2019s MAC address. Note that the IP datagram in this frame has an IP destination address of 68.87.71.226 (the DNS server), while the frame has a destination address of 00:22:6B:45:1F:1B (the gateway router). Bob\u2019s laptop sends this frame to the switch, which delivers the frame to the gateway router.<\/p>\n<p>Still Getting Started: Intra-Domain Routing to the DNS Server<\/p>\n<p>14. The gateway router receives the frame and extracts the IP datagram containing the DNS query. The router looks up the destination address of this datagram (68.87.71.226) and determines from its forwarding table that the datagram should be sent to the (receiving) router in the Comcast network. The IP datagram is placed inside a link-layer frame appropriate for the link connecting the school\u2019s router to the (receiving) Comcast router and the frame is sent over this link.<\/p>\n<p>15. The (receiving) router in the Comcast network receives the frame, extracts the IP datagram, examines the datagram\u2019s destination address (68.87.71.226) and determines from its forwarding table the outgoing interface on which to forward the datagram towards the DNS server. This forwarding table has been filled in by Comcast\u2019s intra-domain protocol (such as RIP, OSPF, or IS-IS) as well as the Internet\u2019s inter-domain protocol, BGP.<\/p>\n<p>16. Eventually the IP datagram containing the DNS query arrives at the DNS server. The DNS server extracts the DNS query message, looks up the name www.google.com in its DNS database, and finds the DNS resource record that contains the IP address (64.233.169.105) for www.google.com (assuming that it is currently cached in the DNS server). 
Recall that this cached data originated in the authoritative DNS server for google.com. The DNS server forms a DNS reply message containing this hostname-to-IP address mapping, and places the DNS reply message in a UDP segment, and the segment within an IP datagram addressed to Bob\u2019s laptop (68.85.2.101). This datagram will be forwarded back through the Comcast network to the school\u2019s router and from there, via the Ethernet switch to Bob\u2019s laptop.<\/p>\n<p>17. Bob\u2019s laptop extracts the IP address of the server www.google.com from the DNS message. Finally, after a lot of work, Bob\u2019s laptop is now ready to contact the www.google.com server!<\/p>\n<p>Web Client-Server Interaction: TCP and HTTP<\/p>\n<p>18. Now that Bob\u2019s laptop has the IP address of www.google.com, it can create the TCP socket (that will be used to send the HTTP GET message) to www.google.com. When Bob creates the TCP socket, the TCP in Bob\u2019s laptop must first perform a three-way handshake with the TCP in www.google.com. Bob\u2019s laptop thus first creates a TCP SYN segment with destination port 80 (for HTTP), places the TCP segment inside an IP datagram with a destination IP address of 64.233.169.105 (www.google.com), places the datagram inside a frame with a destination MAC address of 00:22:6B:45:1F:1B (the gateway router) and sends the frame to the switch.<\/p>\n<p>19. The routers in the school network, Comcast\u2019s network, and Google\u2019s network forward the datagram containing the TCP SYN towards www.google.com, using the forwarding table in each router, as in steps 14\u201316 above. Recall that the router forwarding table entries governing forwarding of packets over the inter-domain link between the Comcast and Google networks are determined by the BGP protocol.<\/p>\n<p>20. Eventually, the datagram containing the TCP SYN arrives at www.google.com. 
The TCP SYN message is extracted from the datagram and demultiplexed to the welcome socket associated with port 80. A connection socket is created for the TCP connection between the Google HTTP server and Bob\u2019s laptop. A TCP SYNACK segment is generated, placed inside a datagram addressed to Bob\u2019s laptop, and finally placed inside a link-layer frame appropriate for the link connecting www.google.com to its first-hop router.<\/p>\n<p>21. The datagram containing the TCP SYNACK segment is forwarded through the Google, Comcast, and school networks, eventually arriving at the Ethernet card in Bob\u2019s laptop. The datagram is demultiplexed within the operating system to the TCP socket created in step 18, which enters the connected state.<\/p>\n<p>22. With the socket on Bob\u2019s laptop now (finally!) ready to send bytes to www.google.com, Bob\u2019s browser creates the HTTP GET message containing the URL to be fetched. The HTTP GET message is then written into the socket, with the GET message becoming the payload of a TCP segment. The TCP segment is placed in a datagram, which is sent and delivered to www.google.com as in steps 18\u201320 above.<\/p>\n<p>23. The HTTP server at www.google.com reads the HTTP GET message from the TCP socket, creates an HTTP response message with the requested Web page content in its body, and sends the message into the TCP socket.<\/p>\n<p>24. The datagram containing the HTTP response message is forwarded through the Google, Comcast, and school networks, and arrives at Bob\u2019s laptop. Bob\u2019s Web browser program reads the HTTP response from the socket, extracts the HTML for the Web page from the body of the HTTP response, and finally (finally!) 
displays the Web page!<br \/>\nNote: This walkthrough omits a number of additional protocols (e.g., NAT running in the school\u2019s gateway router, wireless access to the school\u2019s network, security protocols for accessing the school network or encrypting segments or datagrams, network management protocols) and considerations (Web caching, the DNS hierarchy) that one would encounter in the public Internet.<\/p>\n<p>References: Pages 521-525 of the CS 456 networking textbook &#8220;Computer Networking &#8211; A top-down approach&#8221; by Kurose &amp; Ross (6th edition).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>What Really Happens When We Visit a Web Page The journey down the protocol stack for a perspective of the many, many protocols that are involved in a simple request: downloading a web page. Our setting consists of: a student, Bob, connecting his laptop to his school\u2019s Ethernet switch and downloads a web page (www.google.com). &hellip; <a href=\"https:\/\/sunapi386.ca\/wordpress\/really-happens-visit-web-page\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">What Really Happens When We Visit a Web 
Page<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-475","post","type-post","status-publish","format-standard","hentry","category-academica"],"_links":{"self":[{"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/posts\/475","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/comments?post=475"}],"version-history":[{"count":1,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/posts\/475\/revisions"}],"predecessor-version":[{"id":476,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/posts\/475\/revisions\/476"}],"wp:attachment":[{"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/media?parent=475"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/categories?post=475"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sunapi386.ca\/wordpress\/wp-json\/wp\/v2\/tags?post=475"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}