Which is the better way to draw: glDrawArrays or glDrawElements? Is there any difference?
For both, you pass OpenGL some buffers containing vertex data.
glDrawArrays is basically "draw this contiguous range of vertices, using the data I gave you earlier".
Good:
You don't need to build an index buffer
Bad:
If you organise your data into GL_TRIANGLES, you will have duplicate vertex data for adjacent triangles. This is obviously wasteful.
If you use GL_TRIANGLE_STRIP and GL_TRIANGLE_FAN to try to avoid duplicating data, it isn't terribly effective, and you'd have to make a rendering call for each strip and fan. OpenGL calls are expensive and should be avoided where possible.
With glDrawElements, you pass in a buffer containing the indices of the vertices you want to draw.
Good:
No duplicate vertex data - you just index the same data for different triangles
You can just use GL_TRIANGLES and rely on the vertex cache to avoid processing the same data twice - no need to re-organise your geometry data or split rendering over multiple calls
Bad:
Memory overhead of index buffer
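As a rough sketch of that trade-off (plain C, no GL context needed; the byte counts assume position-only 2D vertices and 16-bit indices): a quad drawn as GL_TRIANGLES needs 6 vertices unindexed, but only 4 unique vertices plus 6 small indices when indexed, and the gap widens as vertices gain attributes like normals and UVs.

```c
#include <stddef.h>

/* Four unique 2D vertices of a unit quad (x, y). */
static const float quad_vertices[4][2] = {
    {0.0f, 0.0f},  /* 0: bottom-left  */
    {1.0f, 0.0f},  /* 1: bottom-right */
    {1.0f, 1.0f},  /* 2: top-right    */
    {0.0f, 1.0f},  /* 3: top-left     */
};

/* Six 16-bit indices describing two triangles sharing the 0-2 edge. */
static const unsigned short quad_indices[6] = {0, 1, 2, 0, 2, 3};

/* Bytes for glDrawElements: unique vertices plus the index buffer. */
size_t indexed_bytes(void)   { return sizeof quad_vertices + sizeof quad_indices; }

/* Bytes for glDrawArrays with GL_TRIANGLES: six full vertices,
   two of them duplicates of existing ones. */
size_t unindexed_bytes(void) { return 6 * sizeof quad_vertices[0]; }
```

With realistic vertices (position + normal + UV is already 32 bytes each), the duplicated vertices cost far more than the 2-byte indices that replace them.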
My recommendation is to use glDrawElements.
The performance implications are probably similar on the iPhone; the OpenGL ES Programming Guide for iOS recommends using triangle strips and joining multiple strips through degenerate triangles.
The link has a nice illustration of the concept. This way you could reuse some vertices and still do all the drawing in one step.
For best performance, your models should be submitted as a single unindexed triangle strip using glDrawArrays with as few duplicated vertices as possible. If your models require many vertices to be duplicated (because many vertices are shared by triangles that do not appear sequentially in the triangle strip or because your application merged many smaller triangle strips), you may obtain better performance using a separate index buffer and calling glDrawElements instead. There is a trade off: an unindexed triangle strip must periodically duplicate entire vertices, while an indexed triangle list requires additional memory for the indices and adds overhead to look up vertices. For best results, test your models using both indexed and unindexed triangle strips, and use the one that performs the fastest.
Where possible, sort vertex and index data so that triangles that share common vertices are drawn reasonably close to each other in the triangle strip. Graphics hardware often caches recent vertex calculations, so locality of reference may allow the hardware to avoid calculating a vertex multiple times.
The downside is that you probably need a preprocessing step that sorts your mesh in order to obtain long enough strips.
I have not come up with a nice algorithm for this yet, so I cannot give any performance or space numbers compared to GL_TRIANGLES. Of course this is also highly dependent on the meshes you want to draw.
Actually, you can insert degenerate triangles to create one continuous strip, so that you don't have to split the draw while using glDrawArrays.
I have been using glDrawElements with GL_TRIANGLES, but I am thinking about using glDrawArrays instead with GL_TRIANGLE_STRIP. That way there is no need to create an index vector.
Does anyone know more about the vertex cache that was mentioned above in one of the posts? I am wondering about the performance of glDrawElements/GL_TRIANGLES vs glDrawArrays/GL_TRIANGLE_STRIP.
The accepted answer is slightly outdated. Following the doc link in Jorn Horstmann's answer, the OpenGL ES Programming Guide for iOS, Apple describes how to use the "degenerate triangles" trick with DrawElements, thereby gaining the best of both worlds.
The minor saving of a few indices from using DrawArrays isn't worth what you give up by not combining all your data into a single DrawElements call. (You could combine everything using DrawArrays, but then any "wasted elements" would be wasted vertices, which are much larger than indices and cost more render time too.)
This also means you don't need to carefully consider each of your models, weighing whether it can be rendered as a minimal number of strips or is too complex. One uniform solution handles everything. (But do try to organize into strips where possible, to minimize the data sent and maximize the likelihood of the GPU re-using recently cached vertex calculations.)
BEST: A single DrawElements call with GL_TRIANGLE_STRIP, containing all your data (that is changing in each frame).
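A sketch of how the joining step might look on the CPU (the function name is mine; this only builds the combined index list and doesn't touch GL):

```c
#include <stddef.h>

/* Join two triangle strips into one index list by repeating the last
   index of the first strip and the first index of the second. The
   repeated indices form zero-area (degenerate) triangles that the GPU
   rejects, so both strips render in a single glDrawElements call.
   Note: if `na` is odd, insert one more duplicate index to keep the
   second strip's winding correct. `out` must hold na + nb + 2 entries. */
size_t join_strips(const unsigned short *a, size_t na,
                   const unsigned short *b, size_t nb,
                   unsigned short *out)
{
    size_t n = 0, i;
    for (i = 0; i < na; ++i) out[n++] = a[i];
    out[n++] = a[na - 1];   /* repeat last index of the first strip   */
    out[n++] = b[0];        /* repeat first index of the second strip */
    for (i = 0; i < nb; ++i) out[n++] = b[i];
    return n;
}
```

Joining strips {0,1,2,3} and {4,5,6} this way yields {0,1,2,3,3,4,4,5,6}: four degenerate triangles in the middle, at the cost of just two extra 16-bit indices.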
Related
I have many fixed objects, like terrains and buildings, and I want to merge them all into one VBO to reduce draw calls and improve performance when there are many objects. I load textures and store their IDs in an array. My question is: can I bind several textures to that one VBO, or must I make a separate VBO for each texture? Or can I issue many glDrawArrays calls for one VBO, based on offset and length, and if so, will that perform well?
In ES 2.0, if you want to use multiple textures in a single draw call, your only good option is to use a texture atlas. Essentially, you store the texture data from multiple logical textures in a single OpenGL texture, and the texture coordinates are chosen so that the desired texture data is used for each primitive. This could be done by adjusting the original texture coordinates, or by feeding an id into the shader and applying an offset to the texture coordinates based on the id.
Of course you can use multiple glDrawArrays() calls for a single VBO, binding a different texture between them. But that goes against your goal of reducing the number of draw calls. You should certainly make sure that the number of draw calls really is a bottleneck before you spend a lot of time on these kinds of optimizations.
In more advanced versions of OpenGL you have additional features that can help with this use case, like array textures.
There are a couple of standard techniques that many game engines use to achieve a low draw-call count.
Batching: This technique combines all objects that use the same material into one mesh. The objects do not even have to be static: if they are dynamic, you can still batch them by passing an array of model matrices.
Texture Atlas: Creating a texture atlas for all static meshes is the best way, as said in the other answer. However, you'll have to do a lot of work to combine the textures efficiently and update their UVs accordingly.
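For the UV update, a minimal sketch (the struct and names are mine): each mesh keeps its local 0..1 UVs, and you remap them into the sub-rectangle its texture occupies inside the atlas. Leaving a small padding border around each sub-rectangle helps avoid bleeding under bilinear filtering.

```c
/* The sub-rectangle one texture occupies inside the atlas,
   expressed in atlas UV space. */
typedef struct { float u0, v0, u1, v1; } AtlasRect;

/* Remap a local UV (0..1 within the original texture) into atlas UVs. */
void remap_uv(AtlasRect r, float u, float v, float *out_u, float *out_v)
{
    *out_u = r.u0 + u * (r.u1 - r.u0);
    *out_v = r.v0 + v * (r.v1 - r.v0);
}
```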
In OpenGL or OpenGL ES you can use indices to share vertices. This works fine if you are only using vertex coords and texture coords that don't change, but when using normals, the normal at a vertex may change depending on the face. Does this mean that you are essentially forced to give up vertex sharing in OpenGL? This article, http://www.opengl-tutorial.org/intermediate-tutorials/tutorial-9-vbo-indexing/ ,
seems to imply that this is the case, but I wanted a second opinion. I'm using .obj models, so should I just forget about trying to share verts? It seems like this would increase the size of my model, though, as I iterate and recreate the array, since I am repeating tons of verts and their tex/normal attributes.
The link you posted explains the situation well. I had the same question in my mind a couple of months ago, and I remember reading that tutorial.
If you need exactly two different normals at a vertex, you should add that vertex to your vertex data twice and index each copy separately. For example, if your mesh is a cube, you should duplicate its corner vertices.
Otherwise, indexing one vertex and calculating an average normal smooths the normal transitions on your mesh. For example, if your mesh is a terrain or a detailed player model, you can use this technique, which saves space and gives a better-looking result.
If you ask how to calculate the average normal: I used the average-normal algorithm from this question, and the result is fast and good.
If the normals are flat per face, you can declare the varying with the "flat" qualifier in your shaders. This means only the value from the provoking vertex is used. With a good model exporter you can get relatively good vertex sharing this way.
I'm not sure about its availability in GLES2, but it is part of GLES3.
Example: imagine two triangles, expressed as a tri-strip:
V0 - Norm0
V1 - Norm1
V2 - Norm2
V3 - Norm3
Your two triangles will be V0/1/2 and V1/2/3. If you mark the varying variable for the normal as "flat", each triangle uses the value from its provoking vertex only; in OpenGL ES 3.0 that is the last vertex of the triangle, so the first triangle will use Norm2 and the second triangle will use Norm3. This means that you can safely reuse vertices in other triangles, even if their normal is "wrong", provided you make sure that each triangle's provoking vertex carries the correct normal.
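To make the reuse rule concrete, a small sketch of how strips decompose (names are mine): triangle i of a strip uses vertices i, i+1, i+2, with the winding flipped on odd i, and under OpenGL ES 3.0's last-vertex convention the flat value comes from vertex i+2.

```c
/* Decompose a triangle strip: triangle i is (i, i+1, i+2), with the
   first two vertices swapped on odd i to keep a consistent winding.
   Under the last-vertex convention of OpenGL ES 3.0, *c (vertex i+2)
   is the provoking vertex, whose flat-qualified values are used. */
void strip_triangle(unsigned i, unsigned *a, unsigned *b, unsigned *c)
{
    if (i % 2 == 0) { *a = i;     *b = i + 1; }
    else            { *a = i + 1; *b = i;     }
    *c = i + 2;  /* provoking vertex */
}
```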
There are many examples for OpenGL ES 2 showing how to draw a single triangle or rectangle.
Google provides an example for drawing shapes (triangles, rectangles) by creating a Triangle and a Rectangle class which basically do all the OpenGL work required to visualize these objects.
But what should you do if you have more than one triangle? What if you have objects consisting of hundreds of triangles with different colors, sizes, and positions? I can't find any good tutorial on dealing with such complex scenarios in OpenGL ES.
My approaches:
So I tried it out. First of all, I slightly changed the Triangle class into a more dynamic class (the constructor now takes the position and the color of the triangle). Basically this is "enough" for drawing complex scenes: every object would consist of hundreds of these Triangle instances, and I render each of them separately. But this consumes a lot of computing power, and I think most of the steps in the rendering process are redundant.
So I tried to "group" triangles into different categories. Now every object has its own vertex buffer and puts all of its triangles into it at once. The performance is far better than before (when every triangle had its own buffer), but I still think that it's not the correct way to go.
Is there any good example on the internet where someone draws something more than simple triangles, or do you know where I can get this information? I really like OpenGL, but it's pretty hard for beginners because of the lack of tutorials (for OpenGL ES 2 on Android).
The standard representation of (triangle) meshes for rendering uses a vertex array containing all the vertices in the mesh and an index array storing the connectivity (the triangles). You definitely want at most one draw call per object (and you might even be able to coalesce several objects).
Interleaved attribute arrays are the most efficient variant with respect to cache efficiency, so one buffer object for the vertex array per object is enough. You might even combine several objects into one buffer object, even if you cannot use a single draw call for both.
As GLES might be limited to 16-bit indices, large models must be split into several "patches".
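A sketch of what "interleaved" means in practice (the struct is illustrative): all attributes of one vertex sit next to each other in memory, and the struct's size and member offsets are exactly the stride and offsets you would hand to glVertexAttribPointer.

```c
#include <stddef.h>

/* One interleaved vertex: position, normal and UV packed together. */
typedef struct {
    float position[3];
    float normal[3];
    float uv[2];
} Vertex;

/* Stride and offsets for glVertexAttribPointer: with an all-float
   struct there is no padding, so these are 32, 12 and 24 bytes. */
size_t vertex_stride(void) { return sizeof(Vertex); }
size_t normal_offset(void) { return offsetof(Vertex, normal); }
size_t uv_offset(void)     { return offsetof(Vertex, uv); }
```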
Although I'm technically working on the Android platform with OpenGL ES 2.0, I believe this applies to other OpenGL technologies as well.
I have a list of objects (enemies, characters, etc.) that I'm attempting to draw onto a grid, each space being 1x1, with each object matching that size. Presently, each object is self-translating: it takes its model coordinates and goes through a simple loop to adjust them to its appropriate grid location in world coordinates (i.e., if it should be at (3,2), it translates its coordinates accordingly).
The problem I've reached is that I'm not sure how to efficiently draw them. I have a loop going through all the objects and calling draw for each one, similar to the Android tutorial, but this seems wildly inefficient.
The objects are each textured with their own square images, matching the 1x1 grid cell they fill. They likely will never need their own unique shaders, so the only things that seem to change between objects are the vertices and the textures.
Is there an efficient way to get each model into the pipeline without flushing because of uniform changes?
This probably requires some trial and error and is probably hardware-dependent. I would use buffer objects for the meshes with GL_STATIC_DRAW, pack several textures into a bigger one, and draw all objects that depend on that bigger texture in one batch to avoid state changes as much as possible. Profile, and give us more information on where your bottleneck is.
I've begun working with Android, so I'm developing a little game (mostly for learning purposes). My game has a simple 2D non-scrolling map, and I have many objects that will be placed on the map. The objects are already modeled statically in their classes, and I understand how to shift these into a float buffer and send them to the shaders.
I understand the gist of the model, view, and projection matrices, but I've heard that translating in the shader, or passing a specific model matrix for each object, is inefficient.
How do I optimally take the modeled objects and place them in the appropriate spots on the map (world coordinates)? Where should that translation occur (before or during the shaders? as part of the model matrix?)
(Pseudo-code is sufficient for an answer if necessary.)
Thank you!
It all comes down to the ratio of how many vertices (of a model) are drawn per translation (change of transformation). As a general rule, for rendering the bulk of the geometry, at least about 100 vertices should be sent per uniform change if you want to max out the pipeline. Note that if your total number of vertices is relatively small, say below 10000, you'll probably not notice any performance penalty on modern systems.
So, if your objects are nontrivial, i.e. have a significant number of vertices, changing the uniforms is the way to go. If your objects are simple, placing them into a shared vertex array, with an additional vertex attribute indexing into a uniform array of transformations, will make better use of the pipeline.
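A sketch of that second option (all names are mine): each vertex carries an extra attribute holding its object's slot in a uniform mat4 array, so a vertex shader declaring something like `uniform mat4 u_transforms[MAX_OBJECTS];` can transform by `u_transforms[int(a_object)]`, and one draw call covers all objects.

```c
#include <stddef.h>

/* One batched 2D vertex: position plus the slot of its object's
   transform in the shader's uniform mat4 array. */
typedef struct { float x, y, object; } BatchVertex;

/* Append `count` (x, y) vertices of one object to the shared array,
   tagging each with `slot`. Returns the new total vertex count. */
size_t batch_object(BatchVertex *out, size_t n,
                    const float *xy, size_t count, float slot)
{
    size_t i;
    for (i = 0; i < count; ++i) {
        out[n + i].x = xy[2 * i];
        out[n + i].y = xy[2 * i + 1];
        out[n + i].object = slot;
    }
    return n + count;
}
```

Per frame you then update only the uniform array of transforms, while the shared vertex buffer stays static.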
It really depends on how complex your objects are.