-
Notifications
You must be signed in to change notification settings - Fork 69
/
networking_monitoring.prolific
287 lines (215 loc) · 17 KB
/
networking_monitoring.prolific
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
View Application Security Groups
### What?
Application Security Groups are a collection of egress (outbound) rules that specify the protocols, ports, and IP ranges where application containers can send traffic. Security groups define rules that *allow* traffic instead of omitting it, which means that the order of evaluation for security groups that apply to the same space, org, or deployment is unimportant. Application containers use these rules to filter and log outbound network traffic.
When applications begin staging, they need traffic rules permissive enough to allow them to pull resources from the network. After an application is running, the traffic rules can be more restrictive and secure. To distinguish between these two security requirements, administrators can define different security groups for *staging* containers versus *runtime* containers.
To provide granular control when securing a deployment, an administrator can also assign security groups to apply across a CF deployment, or to specific spaces or orgs within a deployment.
### How?
1. As admin view the list of security groups
`cf security-groups`
1. View the security groups assigned to staging containers
`cf staging-security-groups`
1. View the security groups assigned to running containers
`cf running-security-groups`
1. View the specific rules of each group using
`cf security-group $group_name`
#### Expected Result
If you've deployed a full Cloud Foundry on GCP you should have two security groups applied to both staging and running apps: `public_networks` and `dns`.
If you run `cf security-group public_networks` you'll see that it allows traffic on all ports with all protocols on all IPs, save four gaps that correspond to [private IPv4 address spaces](https://en.wikipedia.org/wiki/Private_network#Private_IPv4_address_spaces) and the [APIPA reserved range](https://www.pctechbytes.com/networking/fix-169-254-address-problem/). The `dns` security group allows access to any IP, but only on [port 53](https://en.wikipedia.org/wiki/Domain_Name_System#Protocol_transport). The combination of the two groups is that private IPs can be accessed only on port 53, and all other IPs have all ports open.
(If you're working with PCF Dev, you should see three security groups, one of which is named `all_pcfdev` and opens all egress traffic. Because of the `all_pcfdev` security group any other group would be redundant.)
### Resources
[Application Security Groups Documentation](https://docs.cloudfoundry.org/adminguide/app-sec-groups.html)
[Typical Application Security Groups](https://docs.cloudfoundry.org/adminguide/app-sec-groups.html#typical-groups)
["Taking Security to the Next Level—Application Security Groups" by Abby Kearns](https://tanzu.vmware.com/content/blog/taking-security-to-the-next-level-application-security-groups)
["Making sense of Cloud Foundry security group declarations" by Sadique Ali](https://sdqali.in/blog/2015/05/21/making-sense-of-cloud-foundry-security-group-declarations/)
L: security
---
Allowing network traffic to other apps (AKA Configure container networking policies)
### What?
By default, security groups are configured not to allow app containers to address each other directly.
For example, if `app1` has an overlay IP address 10.255.11.9 and `app2` has an overlay IP address 10.255.11.14,
`app1` would fail to curl 10.255.11.14, and `app2` would fail to curl 10.255.11.9.
[Container networking features](https://github.com/cloudfoundry/cf-networking-release) allow `cf` users
to define network policies between apps that override security groups.
Let's see how can manage the networking security of an app that needs to communicate with other apps.
### Prerequisites
Go through the **Basic BOSH Knowledge** story to gain familiarity with BOSH, specifically BOSH CLI commands.
### How?
1) Deploy service discovery
Service Discovery is a container networking feature that allows CF apps to communicate with each other via internal routes.
This way appA doesn't need to know appB's overlay IP address.
Instead appA can just make a request to an internal route that might look like `appB.apps.internal`
and the service discovery controller will translate that route into the overlay IP.
Service Discovery is not enabled by default, but can be enabled by using this opsfile,
[Enable Service Discovery](https://github.com/cloudfoundry/cf-deployment/blob/master/operations/enable-service-discovery.yml).
This opsfile also seeds the internal domain `apps.internal` for you.
The following commands download the BOSH manifest file into a tmp file, adds `enable-service-discovery.yml` to the manifest, and updates your BOSH deployment with service discovery enabled. You may want to create an additional directory under your `~/workspace` directory to contain your manifests and other such configurations. Also, ensure that your BOSH environment variables are set (this will depend on how you set up your onboarding environment).
```
bosh -d cf manifest > /tmp/cf.yml
bosh deploy /tmp/cf.yml -o ../cf-deployment/operations/enable-service-discovery.yml -d cf
```
To confirm an internal domain was seeded, run `cf domains` and see that there is a domain with the name `apps.internal` and that it is listed as `internal` under `details`.
2) Make sure you have two apps pushed. If don't have any apps pushed, `cf push boots -p <PATH_TO_DORA> && cf push swiper -p <PATH_TO_DORA>`.
3) Map an internal route to swiper and boots
Internal routes can only be used for CF apps to talk to each other via container networking. Internal routes are created the same way other routes are, but they are associated with an internal domain.
```
cf map-route swiper apps.internal --hostname swiper
cf map-route boots apps.internal --hostname boots
```
Now if you run `cf app swiper` you should see the route `swiper.apps.internal` listed and if you run `cf app boots` you should see the route `boots.apps.internal` listed.
Before applying a network policy
4) In two different terminals, `cf ssh` to boots' and swiper's application containers.
5) On `boots`, run `curl swiper.apps.internal:8080`. See that the connection fails because there isn't a container networking policy.
6) On `swiper`, run `curl boots.apps.internal:8080`. See that the connection fails because there isn't a container networking policy.
Apply the network policy to allow access:
7) On your command line, run `cf add-network-policy boots --destination-app swiper --protocol tcp --port 8080`
See if the app containers can communicate
8) In your ssh session on boots, try this again: `curl swiper.apps.internal:8080`. It works! Hopefully!
9) Now try the reverse, `curl boots.apps.internal:8080` on swiper. There is not policy in this direction, so it should fail!
### Expected Result
Before the network policy is added, the initial curls from both apps should fail. After applying the networking policy, the curl from boots to swiper should succeed.
But because not policy was added in the other direction, from swiper to boots, that curl should never succeed.
Poke around to see what else you can and can't do with the network policy commands.
Can you allow access to mutliple ports?
What happens if you allow access to a port that the app doesn't listen to? Is it give the same error?
### Resources
[Container networking docs](https://github.com/cloudfoundry/cf-networking-release/tree/develop/docs)
[Docs: Understanding CF Networking](https://docs.cloudfoundry.org/concepts/understand-cf-networking.html)
[Docs: Deploy Apps with CF C2C Networking](https://docs.cloudfoundry.org/devguide/deploy-apps/cf-networking.html)
[Video: CF Networking: All Your Packets are Belong to Us](https://www.youtube.com/watch?v=WrTXd42_m10)
L: security, onboarding-lite
---
Inspect your container to container networking packets with tcpdump
### What?
Sometimes a user says "container to container networking is broken! AppA can't talk to AppB". After making sure that they have container to container networking (c2c) policies set, the next thing you might do is use tcpdump to inspect the packets.
Tcpdump is a CLI tool that allows you to inspect all of the traffic (not just TCP, despite the name) flowing through your container. This is a great tool for debugging how far packets are making it, determining how long the packets are taking to reach their destination, and figuring out where in the TCP handshake connections are failing.
In this story you are going to look at the packets being sent from boots to swiper. Then you'll watch the packets being sent in response.
### Prerequisites
Go through the **Basic BOSH Knowledge** story to gain familiarity with BOSH, specifically BOSH CLI commands.
### How?
**Setup**
1. You should have two apps pushed named boots and swiper from the previous story ("Allowing network traffic to other apps").
1. There should be no c2c network policies. Remove any policies that are left over from the previous story. (`cf network-policies` and `cf remove-network-policy --help`).
**Curl swiper from boots**
1. Get the overlay IPs of boots and swiper.
```
cf ssh boots -c "env | grep CF_INSTANCE_INTERNAL_IP"
cf ssh swiper -c "env | grep CF_INSTANCE_INTERNAL_IP"
```
1. Ssh onto boots.
```
cf ssh boots
```
1. Continually try to curl swiper from boots
```
watch -n 15 curl SWIPER_OVERLAY_IP:8080
```
**Look at those packets**
1. In another terminal, ssh onto the Diego Cell where boots is running and become root (see the help section if you don't know how to do this).
1. Run `tcpdump`.
Ahhhhh too much information! ctrl+c! ctrl+c! On a Diego Cell there are many packets being sent around, and tcpdump gives information about ALL OF THEM. You need to figure out a way to filter this overwhelming stream of information.
1. Filter by packets where the source IP is BOOTS_OVERLAY_IP and where the destination IP is SWIPER_OVERLAY_IP over any interface. (Interested in learning more about network interfaces? It's a little too much to go into here. Contact Amelia Downs about CF Networking Onboarding week! Or google it.)
```
tcpdump -n src BOOTS_OVERLAY_IP and dst SWIPER_OVERLAY_IP -i any
```
Hey! Those are your packets!
Record the packets you see here from one curl.
```
# PUT TCPDUMP OUTPUT HERE
```
If swiper was successfully responding, then you should also see packets being sent as a response in the opposite direction.
1. See that no packets are being sent back from swiper to boots
```
tcpdump -n src SWIPER_OVERLAY_IP and dst BOOTS_OVERLAY_IP -i any
```
**Add c2c policy**
1. Add c2c policy to allow traffic from boots to swiper (`cf add-network-policy --help`)
1. In your original terminal, you should still be curling swiper from soots (if not go back to "curl swiper from boots" section).
**Inspect packets**
1. Look for packets from boots to swiper
```
tcpdump -n src BOOTS_OVERLAY_IP and dst SWIPER_OVERLAY_IP -i any
```
Record the packets you see here from one curl.
```
PUT TCPDUMP OUTPUT HERE
```
How are these packets different from before?
1. Look for packets from swiper to boots
```
tcpdump -n src SWIPER_OVERLAY_IP and dst BOOTS_OVERLAY_IP -i any
```
### Expected Result
You should see packets being sent in response from swiper to boots. You should see `200 OK`.
### Help
How to determine what Diego Cell your app is running on:
```
# First determine the IP of the Diego Cell
cf ssh APP_NAME -c "env | grep CF_INSTANCE_IP"
# Look at bosh output to see which Diego Cell has that IP
bosh instances
```
### Resources
[tcpdump man page](https://www.tcpdump.org/manpages/tcpdump.1.html)
[helpful tcpdump commands](https://www.rationallyparanoid.com/articles/tcpdump.html)
L: security, onboarding-lite
---
Allowing network traffic to other, non-app, private IPs (AKA Making custom ASGs)
### What?
In the previous story, you used container networking policies to allow traffic between app containers. Cool beans.
But, what if you want your app to be able to establish connections with other private IP addresses that aren't apps?
If you need help thinking of a reason you'd want to do this, imagine trying to connect your app to anything that was deployed by BOSH, which mostly deploys to private IPs (which, as we've pointed out in previous stories, are blocked from app containers by default).
The most common example of this are BOSH-deployed services like [cf-mysql](https://github.com/cloudfoundry/cf-mysql-deployment).
However, rather than force you to set up another BOSH deployment, let's instead use our existing CF deployment, and see if we can't think of some CF component as a "service" that an app may want to leverage.
For the sake of this story, let's try to query the router for its list of routes (it's a thing you can do!). We'll create our own ASG to allow our app containers to open a connection to the router's private IP.
### Prerequisites
Go through the **Basic BOSH Knowledge** story to gain familiarity with BOSH, specifically BOSH CLI commands.
### How?
The basic outline of this workflow is this:
1. Curl the router's `/routes` endpoint from an app container, and see that the connection isn't allowed.
1. Create and bind an ASG that allows your app container to connect to the IP address of the router.
1. Try curling the `/routes` endpoint again, and see that you get successful response from the router.
Construct the URL for querying the router
In order to curl the router's `/routes` endpoint, you'll to get the router's IP address, port, as well as special credentials (username and password) for making this request. Don't worry, you can find all of this in your manifest or vars-store/Credhub.
- Router IP Address: Run `bosh instances` and look for an instance group called `router` and copy the IP address.
- Router Status Port: routing-release [sets port 8080 as the default](https://github.com/cloudfoundry/routing-release/blob/b63e07857d90df0dc58339416bb09e860e32cf46/jobs/gorouter/spec#L38-L40) -- unless you wrote an ops-file to override the default, you should use this port.
- Router Status Username: Again, [routing-release has provided a default](https://github.com/cloudfoundry/routing-release/blob/b63e07857d90df0dc58339416bb09e860e32cf46/jobs/gorouter/spec#L41-L43). Go with that, most likely.
- Router Status Password: cf-deployment uses [a BOSH variable called `router_status_password` to configure this password](https://github.com/cloudfoundry/cf-deployment/blob/2118b5d8ac09feed153f6eea552fd561dd6aa999/cf-deployment.yml#L865). The way to fetch this value depends on whether you used Credhub or a vars-store to maintain your credentials.
For Credhub, make sure you've run `eval "$(bbl print-env)"` (don't forget the double quotes!). Then run `credhub find -n router_status_password` to discover the full name of the variable (it will look something like `/bosh-bbl-env-something/cf/router_status_password`). Copy the full name, and run `credhub get -n $FULL_NAME` to get find the password.
For vars-store, just grep `router_status_password` from your vars-store
Before creating the ASG:
1. Make sure you have an app pushed. If don't have one running, `cf push dora -p <PATH_TO_DORA>`.
1. `cf ssh` to dora's application container, and then run `curl router-status:$ROUTER_PASSWORD@$ROUTER_IP:8080/routes`.
Create and bind an ASG:
1. Write a file with contents like this:
```
[
{
"protocol": "tcp",
"destination": "$ROUTER_IP/32",
"ports": "8080",
"description": "Allow apps to query router-status"
}
]
```
1. On your command line, run `cf create-security-group router-asg /PATH/TO/ASG_FILE.json`
1. Bind the ASG to the space that includes your pushed app: `cf bind-security-group router-asg $MY_ORG $MY_SPACE`
1. Restart dora to allow the ASG to take effect: `cf restart dora`
See if the app container can communicate with the router:
1. SSH back onto dora, and re-run `curl router-status:$ROUTER_PASSWORD@$ROUTER_IP:8080/routes`
### Expected Result
The initial curl should fail. After applying the security group, the curl should succeed.
Poke around to see what else you can and can't do with the ASG commands.
Can you allow access to mutliple ports?
What happens if you allow access to a port that the router doesn't listen to? Does it give the same error?
L: security
---
Interested in learning more about networking?
🤓 If you are interested in learning even more about networking, check out these [CF Networking Onboarding stories](https://github.com/pivotal/cf-networking-program-onboarding).
🐞 These stories are intended to help engineers to learn how container networking and routing components work under-the-hood and how to debug them.
➕ To add these stories to this tracker: clone the [repo](https://github.com/pivotal/cf-networking-program-onboarding), run the build script, and then import the created CSV to your tracker.
🙋 Also, the networking program is experimenting with running CF _Networking_ Onboarding weeks, much like the CF Onboarding week that you are apart of right now.
So instead of doing these extra stories now (there are a lot of them), you could sign up for a Networking Onboarding week. Reach out to Amelia Downs if this interests you.
📅 Or if you don't have yet another week free to learn, you could complete the stories in your flex time.
---
[RELEASE] Networking and Monitoring ⇧
L: onboarding-lite