This article gives performance guidelines for materials, shaders, and rendering in general.
A common measure for performance in computer games is Frames Per Second (FPS). Although it gives a good overview of overall performance, it is not suitable for fine-grained performance analysis or for expressing performance differences. The reason is that FPS is defined as 1/frame time and is therefore a non-linear measure. An increase of 2 FPS when a game is running at 20 FPS, for example, corresponds to a gain of roughly 4.5 ms, while the same 2 FPS improvement on a game running at 60 FPS results in a gain of only about 0.5 ms.
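To make this non-linearity concrete, here is a minimal standalone C++ sketch (not engine code; the values are simply the example numbers above) that converts FPS to frame time and prints the actual millisecond gain of a 2 FPS improvement:

```cpp
#include <cstdio>

// Convert frames-per-second into frame time in milliseconds.
static float FrameTimeMs(float fps)
{
    return 1000.0f / fps;
}

int main()
{
    // The same +2 FPS step gives very different real-world gains:
    std::printf("20 -> 22 fps saves %.1f ms\n", FrameTimeMs(20.0f) - FrameTimeMs(22.0f)); // ~4.5 ms
    std::printf("60 -> 62 fps saves %.1f ms\n", FrameTimeMs(60.0f) - FrameTimeMs(62.0f)); // ~0.5 ms
    return 0;
}
```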
A more useful measure is the frame time: the time each frame takes from beginning to end, usually expressed in milliseconds (ms). By setting r_DisplayInfo to 2 instead of 1, it is possible to see the frame time instead of the FPS.
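For example, the CVar can be set in the in-game console, or placed in a configuration file (e.g. system.cfg or user.cfg, depending on your setup):

```
r_DisplayInfo = 2
```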
Useful numbers: at 30 fps each frame has a budget of about 33.3 ms, at 60 fps about 16.7 ms, and 1 ms equals 1000 microseconds (us).
Performance can depend heavily on the execution environment, so it is important to use similar conditions when comparing performance numbers. Performance on systems with different hardware, for example, is likely to vary a lot. GPU time also depends strongly on the screen resolution (higher resolutions usually result in slower performance) and on whether anti-aliasing is used.
If your team is targeting a game running at 30 fps, each frame may not take more than 33 milliseconds to execute. Each processing step of the engine adds to the frame time. In practical terms, this means that if you spend 5 ms on the Zpass, 6 ms on the general rendering pass and 10 ms on shadows (taking into account just these features), it already adds up to about 21 ms, which means your maximum framerate on the GPU side would theoretically be ~47 fps and could never go beyond that.
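As a rough illustration of how pass timings add up to a GPU-side framerate cap, here is a minimal standalone C++ sketch; the per-pass timings are just the hypothetical numbers from the paragraph above, not real measurements:

```cpp
#include <cstdio>

int main()
{
    // Hypothetical per-pass GPU timings in milliseconds, as in the example above.
    const float zPassMs   = 5.0f;
    const float generalMs = 6.0f;
    const float shadowsMs = 10.0f;

    const float frameMs = zPassMs + generalMs + shadowsMs; // ~21 ms
    const float maxFps  = 1000.0f / frameMs;               // ~47.6 fps theoretical GPU-side cap

    std::printf("GPU frame time: %.1f ms -> at most %.1f fps\n", frameMs, maxFps);
    return 0;
}
```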
In the end, every nanosecond matters for performance, especially on consoles. So everyone, including artists and designers, needs to strive to save as much processing time as possible when creating and placing assets.
With the recent addition of deferred rendering, you should strive to do more work in a deferred fashion. For example, caustics used to be rendered as a separate drawcall; with deferred rendering, they are now done in post-processing as a single "full-screen" drawcall (with proper culling). A "deferred drawcall" is much cheaper on the CPU side than a regular forward drawcall, as you don't have to set material parameters. The main cost is shifted to GPU rendering only, and varies with the screen area covered.
CRYENGINE can automatically optimize shadow drawcalls through the r_MergeShadowDrawcalls CVar (enabled by default). This feature checks the asset for similarities in materials and merges as much as possible into a single drawcall. This means it can sometimes actually be cheaper to leave shadow casting enabled on a sub-material than to disable it, because disabling it might cause extra passes to be required. The cheapest possible way to render shadows is to create a dedicated "shadow proxy" mesh which mimics the shape of the overall rendermesh and is the only object in the asset that casts shadows, costing a single drawcall.
As an example, if a scene uses 4k drawcalls in some areas, you can already estimate 4k * 10 us = ~40 ms, which is already beyond budget, and this is just for vertex processing.
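The same kind of back-of-the-envelope estimate can be written down explicitly; this small C++ sketch just multiplies a drawcall count by the assumed ~10 us average cost per drawcall used above (real costs vary per platform and scene):

```cpp
#include <cstdio>

int main()
{
    // Assumed values from the example above; real per-drawcall costs vary.
    const int   drawcalls     = 4000;
    const float costPerCallUs = 10.0f;

    const float totalMs = drawcalls * costPerCallUs / 1000.0f; // ~40 ms
    std::printf("%d drawcalls * %.0f us = %.0f ms\n", drawcalls, costPerCallUs, totalMs);
    return 0;
}
```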
So the bottom line is: keep the drawcall count as low as possible.
Using the right shader for your material type (e.g. the Vegetation shader should be used for vegetation) will ensure the best performance possible.
Here are some specific examples of performance-critical cases:
Alpha testing is quite expensive on older PC hardware and on consoles. Using alpha testing forces you to read the diffuse texture to get the alpha channel value (and possibly reject that sample), which forces you to skip specific optimizations for fully opaque rendering.
This is especially problematic for shadow depth map rendering and the Zpass: these passes could otherwise run fully opaque with no texture reads, but with alpha testing enabled the texture must be sampled.
Alpha blending, on the other hand, also forces you to skip specific opaque rendering optimizations, and additionally the current results have to be blended with the framebuffer, which on some hardware (e.g. PS3) is somewhat expensive. Also, some rendering techniques will not work perfectly on such assets (e.g. fog will have to be computed per-vertex instead of per-pixel).
In general, try to avoid such cases on the design/art side whenever possible.
Each post effect adds a relatively big rendering cost. Try to minimize this by not enabling too many at the same time, for example by using different timings for each post effect and by reducing the amount of time each post effect is visible on screen.
If you really need a lot of them enabled at the same time, try to merge them into a single unique post process.