I have many fixed objects like terrains and buildings and I want to merge them all in one VBO to reduce draw calls and enhance performance when there are too many objects, I load textures and store their ids in an array, my question is can I bind textures to that one VBO or must I make a separate VBO for each texture? or can I make many glDrawArrays for one VBO based on offset and length, if I can do that will this be smooth and well performed?
In ES 2.0, if you want to use multiple textures in a single draw call, your only good option is to use a texture atlas. Essentially, you store the texture data from multiple logical textures in a single OpenGL texture, and the texture coordinates are chosen so that the desired texture data is used for each primitive. This could be done by adjusting the original texture coordinates, or by feeding an id into the shader and applying an offset to the texture coordinates based on the id.
Of course you can use multiple glDrawArrays() calls for a single VBO, with binding a different texture between them. But that goes against your goal of reducing the number of draw calls. You should certainly make sure that the number of draw calls really is a bottleneck for you before you spend a lot of time on these types of optimizations.
In more advanced versions of OpenGL you have additional features that can help with this use case, like array textures.
There are couple of standard techniques that many Game Engines perform to achieve low draw calls.
Batching: This technique combines all objects referring to same material and combines them into one mesh. The objects does not even have to be static. If they are dynamic you can still batch them by passing the Model Matrix array.
Texture Atlas: Creating texture atlas for all static meshes is the best way as said in the other answer. However, you'll have to do a lot of work for combining the textures efficiently and updating their UVs accordingly.
Related
I'm trying to make a 2d map (for a game, think tiled world map) in OpenGL ES 2.0 for an android game. Basically, there are a few tile types that have different textures, and the map is randomly generated from these types, so from game-to-game the map changes but for the duration of a single game it stays the same.
My first thought was to generate a single large texture / image / bitmap (independent from OpenGL) beforehand basically stitching duplicate tile textures together to make the larger map, and then using this single texture for one large map rectangle. In theory I think this is simple and would work fine, but I'm worried that it won't scale well for larger maps and especially on mobile I'll run out of memory with such a large image map. Plus, there's a small set of tiles that are duplicated over and over so it seems like a tremendous waste to duplicate the pixel data in a big texture over and over.
My second thought was having many textures, one for each of the tile textures. But I'm not sure how this would work, texture-binding-wise, would I need the shaders to contain multiple texture references and within the shader have logic for using the right one?
Finally, I thought using a texture atlas could work, have one texture / image with all of the tile data in it, this would be relatively small. But I'm struggling to imagine how to get the maths to work out such that "tiles" or subsections of the map rectangle would use completely different texture coordinates.
Am I approaching this the wrong way? Should I be using a rectangle for each tile? At least this way I can pass the shaders both vertex and texture coordinates for each tile independently. This seems easier, but also seems wrong since the map really is just one rectangle that won't be changing.
My first thought was to generate a single large texture...
Actualy, something like this has already been used in id Software's id Tech since version 4. It's called MegaTexture. Basicaly, it's a big texture, which could also hold additional data.
My second thought was having many textures...
You don't need to hold all the textures in a shader. Do it like this:
Implement a loop with n iterations, where n is how much different types of textures are used.
Inside a loop, bind the current texture type.
Pass any data, like position/color/texture coords to shaders.
Draw all tiles that use the bounded texture. You could use GLES30.glDrawElementsInstanced or GLES30.glDrawArraysInstanced if you are targeting devices with GLES 3.x or an appropriate extension support. Otherwise, draw your tiles using GLES20.glDrawArrays or GLES20.glDrawElements.
Shaders won't be complicated with this approach.
Finally, I thought using a texture atlas could work...
You could use loop here too and compute the texture coordinates for each tile type on CPU, then just pass them to shaders.
Considering your map is not changing through a game session, MegaTexture approach looks good. However, it depends on how large your map is and how much memory is available. Also, note that max texture size is limited. Max size differs from device to device but should be (AFAIK) equal or greater than screen size and at least 64 texels(16 for cube-mapped textures). You can get the maximum texture size on any device using glGet(GL_MAX_TEXTURE_SIZE ).
There are many examples for OpenGL ES 2 in how to visualize a single triangle or rectangle.
Google provides an example for drawing shapes (triangles, rectangles) by creating a Triangle and Rectangle class which basically do all the opengl-stuff required for visualize these objects.
But what should you do, if you have more than one triangle? What if you have objects, consists of hundreds of triangles of different colors, different sizes and positions? I can't find any good tutorial for dealing with complex scenarios in opengl es.
My approaches:
So I tried it out. First of all I've slightely changed the Triangle-Class to a more dynamic class (the constructor now gets the position and the color of the triangle). Basically this is "enough" for drawing complexe scenes. Every object would consist out of hundreds of these Triangle-classes and I render each of them seperately. But this consumes much computing power and I think most of the steps in the rendering process are redundant.
So I tried to "group" triangles into different categories. Now every object has his only vertexbuffer and puts every triangle at once in it. Now the performance is far better than before (where every triangle had his own buffer) but I still think, that it's not the correct way to go.
Is there any good example in the internet, where someone is drawing more than simple triangles or do you know where I can get these information from? I really like OpenGL but it's pretty hard for beginners because of the lack of tutorials (for OpenGL ES 2 in Android).
The standard representation of (triangle) meshes for rendering is using a vertex array containing all the vertices in the mesh, and an index array connecting storing the connectivity (triangles). You definitively want at most one draw call per object (but you might even be able to coalesce several objects).
Interleaved attribute arrays are the most efficient variant wrt. cache efficiency, so one Buffer object for the VA per object is enough. You might even combine several objects into one buffer object, even if you can not use a single draw call for both.
As GLES might be limited to 16 Bit indices, large models must be splitted into several "patches".
Although I'm technically working in the android platform with OpenGL 2.0 ES, I believe this can be applied to more OpenGL technologies.
I have a list of objects (enemies, characters, etc) that I'm attempting to draw onto a grid, each space being 1x1, and each object matching. Presently, each object is self translating... that is, it's taking its model coordinates and going through a simple loop to adjust them to be located in the world coordinates in its appropriate grid location. (i.e. if it should be at (3,2) it will translate it's coordinates accordingly.
The problem I've reached is I'm not sure how to effeciently draw them. I have a loop going through all the objects and calling draw for each object, similar to the android tutorial, but this seems wildly ineffecient.
The objects are each textured with their own square images, matching the 1x1 grid they fill. They likely will never need their own unique shaders, so the only thing that seems to change between objects is the verticies and the shaders.
Is there an effecient way to get each model into the pipeline without flushing because of uniform changes?
This probably requires some try and error procedure an probably is hardware dependent. I would use buffer objects for the meshes with GL_STATIC_DRAW, pack some textures in a bigger one and draw all objects depending on that bigger texture in batch to avoid states changes as much as possible. Profile and get us more information on where is your bottleneck.
I've begun working with android, so I'm developing a little game (mostly for learning purposes). My game has a simple 2d non-scrolling map, and I have many objects that will be placed in the map. The objects are already modeled statically in their classes, and I understand how to shift these into a float buffer and send them to the shaders.
I understand the gist of the model, view, and project matrices, but I've heard that translating in the shader, or passing specific model matrices for each object is inefficient.
How do I, optimally, take the modeled objects, and place them in the appropriate spot on the map (world coordinates)? Where should that translate occur (before or durring shaders? As part of the model matrix?)
(Pseudo-code is sufficient for an answer if necessary.)
Thank you!
It all comes down to the ratio of how many vertices (of a model) come to a single translation (change of transformation). As a general rule, for rendering the bulk of geometry, about at least 100 vertices should be sent for a given uniform if you want to max out the pipeline. Note that if your total number of vertices is relatively small about below 10000, you'll probably not notice any performance penality on modern systems.
So, if your objects are nontrivial, i.e. have a significant number of vertices, changing the uniforms is the way to go. If your objects are simple, placing them into a shared Vertex Array, with an additional vertex attribute indexing into a uniform arrays of transformations will make better use of the pipeline.
It really depends on how complex your objects are.
What is the best way: if I use glDrawArrays, or if I use glDrawElements? Any difference?
For both, you pass OpenGL some buffers containing vertex data.
glDrawArrays is basically "draw this contiguous range of vertices, using the data I gave you earlier".
Good:
You don't need to build an index buffer
Bad:
If you organise your data into GL_TRIANGLES, you will have duplicate vertex data for adjacent triangles. This is obviously wasteful.
If you use GL_TRIANGLE_STRIP and GL_TRIANGLE_FAN to try and avoid duplicating data: it isn't terribly effective and you'd have to make a rendering call for each strip and fan. OpenGL calls are expensive and should be avoided where possible
With glDrawElements, you pass in buffer containing the indices of the vertices you want to draw.
Good
No duplicate vertex data - you just index the same data for different triangles
You can just use GL_TRIANGLES and rely on the vertex cache to avoid processing the same data twice - no need to re-organise your geometry data or split rendering over multiple calls
Bad
Memory overhead of index buffer
My recommendation is to use glDrawElements
The performance implications are probably similar on the iphone, the OpenGL ES Programming Guide for iOS recommends using triangle strips and joining multiple strips through degenerate triangles.
The link has a nice illustration of the concept. This way you could reuse some vertices and still do all the drawing in one step.
For best performance, your models should be submitted as a single unindexed triangle strip using glDrawArrays with as few duplicated vertices as possible. If your models require many vertices to be duplicated (because many vertices are shared by triangles that do not appear sequentially in the triangle strip or because your application merged many smaller triangle strips), you may obtain better performance using a separate index buffer and calling glDrawElements instead. There is a trade off: an unindexed triangle strip must periodically duplicate entire vertices, while an indexed triangle list requires additional memory for the indices and adds overhead to look up vertices. For best results, test your models using both indexed and unindexed triangle strips, and use the one that performs the fastest.
Where possible, sort vertex and index data so that that triangles that share common vertices are drawn reasonably close to each other in the triangle strip. Graphics hardware often caches recent vertex calculations, so locality of reference may allow the hardware to avoid calculating a vertex multiple times.
The downside is that you probably need a preprocessing step that sorts your mesh in order to obtain long enough strips.
I could not come up with a nice algorithm for this yet, so I can not give any performance or space numbers compared to GL_TRIANGLES. Of course this is also highly dependent on the meshes you want to draw.
Actually you can degenerate the triangle strip to create continuous strips so that you don't have to split it while using glDrawArray.
I have been using glDrawElements and GL_TRIANGLES but thinking about using glDrawArray instead with GL_TRIANGLE_STRIP. This way there is no need for creating inidices vector.
Anyone that knows more about the vertex cache thing that was mentioned above in one of the posts? Thinking about the performance between glDrawElements/GL_TRIANGLE vs glDrawArray/GL_TRIANGLE_STRIP.
The accepted answer is slightly outdated. Following the doc link in Jorn Horstmann's answer, OpenGL ES Programming Guide for iOS, Apple describes how to use "degenerate-triangles" trick with DrawElements, thereby gaining the best of both worlds.
The minor savings of a few indices by using DrawArrays isn't worth the savings you get by combining all your data into a single GL call, DrawElements. (You could combine all using DrawArrays, but then any "wasted elements" would be wasting vertices, which are much larger than indices, and require more render time too.)
This also means you don't need to carefully consider all your models, as to whether most of them can be rendered as a minimal number of strips, or they are too complex. One uniform solution, that handles everything. (But do try to organize into strips where possible, to minimize data sent, and maximize GPU likeliness of re-using recently cached vertex calculations.)
BEST: A single DrawElements call with GL_TRIANGLE_STRIP, containing all your data (that is changing in each frame).