Un Chti café: October 2019

Hi folks

In the first place, running out of disk space with a Cassandra cluster is not something you really want to experience, trust me...

Caused by: java.io.IOException: No space left on device

I guess that if you're reading this post, this advice comes too late for you, though.

Why does Cassandra need (so much) free space?

Under the hood, Cassandra runs internal processes that need temporary disk space (up to as much as it is already using...), such as:

Running compactions: as SSTables are immutable, compactions reorganize the data by writing new SSTables, and they need free disk space to do so until the old SSTables can be deleted.
Keeping snapshots: a snapshot is a copy of the SSTables at a certain point in time. Cassandra uses hard links to create snapshots, so taking a snapshot does not increase disk usage at first. Over time, though, keeping snapshots does increase it, because the snapshot files are not deleted when compaction removes the original SSTables.

What can you do?

The first question is: is it only one node, or your entire cluster, that is running out of disk space?

If it's only one node, you can follow the proposals below, but sometimes it's easier to trash the node and replace it with a fresh one... Thanks to the way Cassandra distributes data across a cluster, you should not end up with one node holding (a lot) more data than the others.

If it's your entire cluster that is running out of disk space... Well... This is where the fun begins... I can't promise that you will not lose data...

Most of the actions you can run on a node to reclaim space will first increase the disk usage a bit...

Quick wins

Stop writing data into the cluster

It sounds a bit weird but yes, first of all let's stop the bleeding...

Clear Snapshots

Run on each node:

nodetool cfstats

nodetool listsnapshots

These commands will show you whether you have any snapshots. If so, they are good candidates for reclaiming space, and you can delete them:

nodetool clearsnapshot
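To get a rough idea of how much space the snapshots hold before clearing them, here is a minimal Python sketch. It assumes the usual on-disk layout `<data_dir>/<keyspace>/<table>/snapshots/<snapshot_name>/`; the function name `snapshot_sizes` and the example path are hypothetical. Keep in mind that snapshots are hard links, so only the files that are no longer part of the live SSTables actually free space when the snapshot is cleared.

```python
from pathlib import Path


def snapshot_sizes(data_dir):
    """Return {snapshot_name: total_bytes} summed over all tables.

    Assumes the layout <data_dir>/<keyspace>/<table>/snapshots/<name>/.
    The totals are an upper bound on what clearing a snapshot reclaims,
    because files still hard-linked from live SSTables free nothing.
    """
    totals = {}
    for snap_dir in Path(data_dir).glob("*/*/snapshots/*"):
        if not snap_dir.is_dir():
            continue
        size = sum(f.stat().st_size for f in snap_dir.rglob("*") if f.is_file())
        totals[snap_dir.name] = totals.get(snap_dir.name, 0) + size
    return totals


if __name__ == "__main__":
    # Hypothetical data directory; adjust to your installation.
    for name, size in sorted(snapshot_sizes("/var/lib/cassandra/data").items()):
        print(f"{name}: {size / 1024 ** 2:.1f} MiB")
```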

Increase the disk size of the nodes

If you can temporarily increase the disk size of your nodes, it's worth doing to get back to a less critical state first, a state where you can think about adding nodes and so on. If you're running your cluster in the cloud, most cloud providers offer ways to extend a disk. In most cases you'll have to stop and start the VMs hosting your nodes.

Remove data (if you can...)

It can be a bit extreme as well but depending on your context and your use case it's perhaps possible...

You can drop or truncate tables. This solution is quite efficient because no tombstones are written. Cassandra just creates a snapshot of the table when you run the command (with the default auto_snapshot setting). The disk space is released when you clear the snapshot.

Not so quick win

Add nodes

This is the usual procedure... If your cluster needs more space, add more nodes...

Adding nodes means that data ownership changes between nodes. Cassandra does not automatically remove the data that has moved to other nodes. Do not forget to run a cleanup on each node:

nodetool cleanup

If you're still very limited in terms of available disk space, be careful, because running a cleanup temporarily increases disk usage. You can limit this increase by running the cleanup table by table:

nodetool cleanup yourkeyspace yourtable
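If you have many tables, the table-by-table cleanup can be scripted. The sketch below is only an illustration: the helper names `cleanup_commands` and `run_sequentially` are hypothetical, and the keyspace and table names are the placeholders used above. It builds one `nodetool cleanup <keyspace> <table>` command per table and runs them one after another, so that only one table at a time needs temporary space.

```python
import subprocess


def cleanup_commands(tables_by_keyspace):
    """Build one 'nodetool cleanup <keyspace> <table>' command per table."""
    commands = []
    for keyspace, tables in tables_by_keyspace.items():
        for table in tables:
            commands.append(["nodetool", "cleanup", keyspace, table])
    return commands


def run_sequentially(commands):
    """Run the commands one at a time and stop at the first failure."""
    for command in commands:
        print("running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder names; replace with your own keyspaces and tables.
    run_sequentially(cleanup_commands({"yourkeyspace": ["yourtable"]}))
```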

Remove data (if you can...)

The other way to delete data is to issue deletes, which insert tombstones into your cluster. To avoid waiting for gc_grace_seconds to elapse before the tombstones can be evicted (by default it's 10 days), you can change the value with the ALTER TABLE CQL command. Before doing that, check that all nodes are up:

ALTER TABLE keyspace.yourtable WITH gc_grace_seconds = 3600

Best advice

If you survived the issue, I'm pretty sure you don't want to face it again. I would warmly recommend monitoring the disk usage of the nodes in your cluster.

If you're using the SizeTieredCompactionStrategy (which is the worst in terms of required free space), a good practice is to keep your disk usage below 50 to 60% and to add nodes before reaching the 70% threshold.

There is currently an open ticket to tackle this type of problem:
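As a starting point for monitoring, here is a small Python sketch that turns these thresholds into a status. The function names and the data directory path are assumptions, and the 60% and 70% values are simply the figures quoted above for SizeTieredCompactionStrategy; tune them for your own setup.

```python
import shutil

# Thresholds taken from the advice above (SizeTieredCompactionStrategy).
WARNING_RATIO = 0.60   # try to stay below 50 to 60%
CRITICAL_RATIO = 0.70  # add nodes before reaching this point


def classify_usage(used_bytes, total_bytes):
    """Return 'ok', 'warning' or 'critical' for a given disk usage."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    ratio = used_bytes / total_bytes
    if ratio >= CRITICAL_RATIO:
        return "critical"
    if ratio >= WARNING_RATIO:
        return "warning"
    return "ok"


def check_data_dir(path):
    """Classify the usage of the filesystem that holds the given path."""
    usage = shutil.disk_usage(path)
    return classify_usage(usage.used, usage.total)


if __name__ == "__main__":
    # Hypothetical data directory; adjust to your installation.
    print(check_data_dir("/var/lib/cassandra/data"))
```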

https://issues.apache.org/jira/browse/CASSANDRA-14499

Enjoy!

Hi Cassandra folks,

I would like to share my experience of organizing a Cassandra Lan party.

<;-)>

Are we still the worldwide Cassandra Lan party record holder with 36 nodes? 🤔https://t.co/VH9PIc2o9j @nromanetti @framiere #itwassocool #Cassandra
— Jérémy Sevellec (@jsevellec) September 20, 2019

It looks like I'm still the record holder for the biggest Cassandra Lan party, organized with a 36-node cluster. I would be so happy if someone beat this record!

</;-)>

I've already organized a bunch of Cassandra Lan parties during conferences or meetups. If you want to do it the right way and avoid losing time or simply failing, it takes a bit of work and preparation...

Here's my How-to for preparing and running a Cassandra Lan party. Feel free to use it or to take just a part of it! I know that this How-to works because I did it several times. It does not mean that you can't do it another way, though.

Special thanks to friends who were already part of Cassandra Lan party organizations with me:

The goal

The basic goal of a Cassandra Lan Party is:

to create a network with the attendees' laptops
to create a Cassandra cluster on top of it. One node per attendee
to let attendees manipulate data of the cluster all together
to play with the cluster (like "oh s..., we lost one data center" and so on)

The ideal situation is also to be able to:

setup a multi data center Cassandra cluster. 3DCs with a replication factor of 3 per DC
be isolated in terms of network

1. Preparation

You need a Team:

1 driver who displays slides
1 team member per DC who:

knows the procedure
will be responsible for the network setup of the data center
will be a seed node with his own machine
will help the other attendees who are part of the data center

So basically, a team of 4 people is an ideal setup if you want to simulate a cluster with 3 data centers.

You need network equipment:

1 small switch (at least 4 ports)
3 big switches (the bigger they are the more attendees you can get)
3 RJ45 network cables, at least 3 to 5 meters long, to connect the switches together
as many RJ45 network cables as you can get for the attendees (a mix of 1m, 2m and 3m is perfect)

You need additional equipment:

a luggage to carry everything in

flashy t-shirts, so attendees can identify the Cassandra Lan Party team easily

USB keys
a few power strips
a roll of adhesive tape
sticky notes
a few felt pens

I know that the trickiest part is finding the network equipment. In my case, I asked some companies around me whether they would lend me the equipment in exchange for a bit of visibility during the Lan Party, and it worked fine...

Also, regarding USB Keys, the goal is to load them with:

JDK8 for all OS platforms
Python for windows
The Cassandra binary

If you don't have any USB Keys, you also have the option to distribute the archives thanks to the Cassandra Lan Party Configurator, which the driver will run from his laptop once the network setup is done during the Cassandra Lan party (more details below).

In addition to that, you need:

Some Slides to guide the attendees
the Cassandra Lan party configurator (again, more details below)...

And that's it!

I would advise, if you can, testing the network equipment before the Cassandra Lan party D-Day. You can also organize a rehearsal with just the team or a few close friends. It helps you feel more relaxed on D-Day.

If you can contact the attendees before the event, you can ask them :

to come with a laptop
to install a JDK 8
to come with an RJ45 adapter for the laptop (for recent Macs, for instance...)

2. Room preparation

To better picture the DCs physically, group tables to create 3 areas which will define your 3 DCs. One big switch per DC.

You can also spread sticky notes and pens on the tables.

Here is an example of a room setup I did for a rehearsal:

Each table is a DC, plus the table at the back with the driver laptop, which also displays the slides, and the smaller switch.

3. Welcome attendees

Just a few things about that:

try to guide attendees so that they spread across all the DCs. The ideal situation is to have the same number of attendees per DC (or close to it). You need at least 2 attendees per DC plus the DC team member (the setup uses an RF of 3 per DC).
display a slide kindly asking people not to touch anything. In my experience, attendees are sometimes quite excited to participate and want to go too fast. The idea is to go slowly at the beginning so that the Cassandra Lan Party does not fail.
distribute USB Keys
start the Cassandra lan Party with slides

4. Network Setup

The first thing is to ask attendees to switch off Wi-Fi.

The goal is to create this network setup :

The small switch (WORLD) is connecting the 3 DCs (LILLE, SAN FRANCISCO and SINGAPORE) all together. The driver laptop is connected to the small switch and all the other attendees and team members are connected to a DC switch.

The IP allocation is done manually to be able to control it.

As we will use Cassandra's RackInferringSnitch to distribute the nodes across the DCs, the attendees' IPs have to look like:

10.X.1.Y

where:

X corresponds to the number of the DC (in our case "1" for Lille, "2" for SF, ...)
Y is an incremental number given by the DC team member. "1" will be used for each DC team member
10.1.1.0 will be for the driver laptop

For example for the Lille DC:

10.1.1.1 : DC team member
10.1.1.2 : 1st DC attendee
10.1.1.3 : 2nd DC attendee
...
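The addressing scheme can be sketched in a few lines of Python. This is only an illustration with hypothetical helper names: `attendee_ip` builds an address following the 10.X.1.Y pattern, and `infer_dc_and_rack` mimics what RackInferringSnitch does, which is to take the data center from the second octet of the IP and the rack from the third one.

```python
import ipaddress


def attendee_ip(dc_number, host_number):
    """Build an address following the 10.X.1.Y scheme described above."""
    if not 1 <= dc_number <= 255:
        raise ValueError("dc_number must be between 1 and 255")
    if not 1 <= host_number <= 254:
        raise ValueError("host_number must be between 1 and 254")
    return f"10.{dc_number}.1.{host_number}"


def infer_dc_and_rack(ip):
    """Mimic RackInferringSnitch: DC = 2nd octet, rack = 3rd octet."""
    octets = ipaddress.IPv4Address(ip).packed
    return str(octets[1]), str(octets[2])


if __name__ == "__main__":
    # Lille DC: the team member first, then two attendees.
    for host in (1, 2, 3):
        ip = attendee_ip(1, host)
        print(ip, infer_dc_and_rack(ip))
```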

To avoid any issue and a big network mess, I would advise using sticky notes to write down each IP with the name of the attendee, sticking them on the wall closest to the DC, and also controlling who plugs into the switch.

Once connected, you can ask attendees to do the network configuration:

IP : the one from the sticky note
network mask : 255.0.0.0
no proxy
no gateway

And then ask them to ping:

10.1.1.0
10.1.1.1
10.2.1.1
10.3.1.1

5. A bit of Cassandra theory

Usually, once the network setup is done, or if there are one or two laptops causing issues, I would recommend doing a bit of Cassandra theory with a few slides. How deep you need to go depends a bit on the experience of the attendees. It's up to you to decide how far you want to go. The good thing is that it leaves a bit of time to try to solve network issues in the background.

6. Cassandra Setup

Again, the first gentle reminder you have to give attendees is not to start Cassandra before they get the green light from the DC team member.

How to do the distributed setup: there's an app for that.

The Cassandra Lan Party Configurator is a web application that helps attendees set up Cassandra on their machines. The goal is to let attendees do the setup themselves, guided by the application. The application is open source and Apache licensed.

Main features are:

attendees configuration helper
archives provider
cluster status

I would recommend running the application from the driver laptop.

7. Cassandra cluster start

Once all attendees are done with the Cassandra setup, you can start nodes.

As we set the auto_bootstrap property to false, the creation of the cluster is faster. The drawback is that if an attendee joins after the creation of the keyspace, he will have to remove the auto_bootstrap property to join the cluster and receive data.

First, start the seed nodes, which should be the machines of the DC team members. Then, once the seed nodes are up, each DC team member can add nodes one after another by asking each attendee to start his node.

8. Play with the Cassandra cluster

Once everyone is there, create a keyspace and one table that all attendees will be able to play with.

The important thing is to create the keyspace with the right strategy and an RF of 3 per DC.

CREATE KEYSPACE "lanparty" WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', '1' : 3, '2' : 3, '3' : 3};
use lanparty;
CREATE TABLE attendee (
  email text primary key,  first_name text,  last_name text);
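If the number of DCs changes from one Lan party to another, the keyspace statement can be generated instead of typed by hand. Here is a minimal Python sketch; the function name `create_keyspace_cql` is hypothetical, and the DC names are assumed to be the numbers inferred by RackInferringSnitch, as in the statement above.

```python
def create_keyspace_cql(name, rf_by_dc):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    rf_by_dc maps each data center name to its replication factor.
    """
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in rf_by_dc.items():
        options.append("'%s': %d" % (dc, int(rf)))
    body = ", ".join(options)
    return 'CREATE KEYSPACE "%s" WITH REPLICATION = {%s};' % (name, body)


if __name__ == "__main__":
    # Three DCs named "1", "2" and "3", with an RF of 3 each.
    print(create_keyspace_cql("lanparty", {"1": 3, "2": 3, "3": 3}))
```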

Then you can ask every attendee to play with cqlsh to insert and read data...

And the cool thing is that you can simulate DC connection issues by just unplugging the network cable that links one DC switch to the small switch.
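To help attendees predict what happens when a DC is unplugged, here is a small Python sketch of the quorum arithmetic. The helper names are hypothetical. It relies on the standard rule that a quorum is a majority of the replicas, floor(replicas / 2) + 1, counted over all DCs for QUORUM and over the local DC only for LOCAL_QUORUM.

```python
def quorum(replicas):
    """Number of replicas that must answer: a strict majority."""
    return replicas // 2 + 1


def quorum_survives(rf_by_dc, lost_dcs):
    """Tell whether QUORUM can still be reached once some DCs are lost."""
    total = sum(rf_by_dc.values())
    alive = sum(rf for dc, rf in rf_by_dc.items() if dc not in lost_dcs)
    return alive >= quorum(total)


if __name__ == "__main__":
    rf_by_dc = {"1": 3, "2": 3, "3": 3}
    print("QUORUM needs", quorum(sum(rf_by_dc.values())), "replicas")
    print("LOCAL_QUORUM needs", quorum(rf_by_dc["1"]), "replicas in the local DC")
    print("one DC lost:", quorum_survives(rf_by_dc, {"3"}))
    print("two DCs lost:", quorum_survives(rf_by_dc, {"2", "3"}))
```

With 3 DCs and an RF of 3 each, QUORUM needs 5 of the 9 replicas, so it still works with one DC unplugged (6 replicas left) but not with two (3 replicas left), while LOCAL_QUORUM only needs 2 replicas in the DC the client is connected to.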

I let your imagination do the rest...

Conclusion

I hope this How-to helps, and I'm looking forward to feedback from new Cassandra Lan parties!

As you have seen, the network setup with switches and network cables is the biggest part of the Cassandra Lan Party. It's also a fun part, because attendees really like it.

I think it could be possible to replace the 3 big switches and the network cables with 3 Wi-Fi access points.

Don't hesitate to contact me if you have any questions!

Jérémy

Tuesday, October 8, 2019

Cassandra : What to do when you run out of disk space?